Activation Induced Deaminase (AID)

ABSTRACT

The invention is directed to a cell comprising a nucleic acid encoding an Activation Induced Deaminase (AID) polypeptide, a fusion protein comprising an AID polypeptide, and methods of using a nucleic acid encoding an AID polypeptide.

CROSS-REFERENCE TO RELATED APPLICATIONS

This patent application is a divisional of copending U.S. patent application Ser. No. 12/911,292, filed Oct. 25, 2010, which is a divisional of U.S. patent application Ser. No. 10/985,321, filed Nov. 10, 2004, issued as U.S. Pat. No. 7,820,442, which is a continuation of International Application No. PCT/GB03/02002, filed May 9, 2003, the entirety of which is incorporated herein by reference.

INCORPORATION-BY-REFERENCE OF MATERIAL SUBMITTED ELECTRONICALLY

Incorporated by reference in its entirety herein is a computer-readable nucleotide/amino acid sequence listing submitted concurrently herewith and identified as follows: One 7,161 Byte ASCII (Text) file named “711017SequenceListing.TXT,” created on Sep. 5, 2012.

The present invention identifies that the expression of Activation Induced Deaminase (AID) and its homologues, such as Apobec, in cells confers a mutator phenotype and thus provides a method for generating diversity in a gene or gene product as well as cell lines capable of generating diversity in defined gene products. The invention also provides methods of modulating a mutator phenotype by modulating the expression or activity of AID or its homologues.

BACKGROUND

In normal cells, a low mutation rate ensures genetic stability and this depends on effective DNA repair mechanisms for repairing the many accidental changes that occur continually in DNA.

However, during the generation of antibodies, point mutations occur within the V-region coding sequence of the antigen receptor loci by a process called somatic hypermutation, and the rate of mutation observed is about a million times greater than the spontaneous mutation rate in other genes. The antigen receptor loci are the only loci in human cells that undergo programmed genetic alterations. However, the mechanisms that allow the nucleotide changes to be controlled and targeted to the DNA of a precisely specified part of the genome in this way are not known.

Functional antigen receptors are assembled by RAG-mediated gene rearrangement and the isotype switch from IgM to IgG, IgA and IgE is effected by class switch recombination. Aberrant forms of RAG-mediated gene rearrangement and class switch recombination have been shown to underpin many of the chromosomal translocations associated with lymphoid malignancies. In the case of somatic hypermutation, it was proposed several years ago by Rabbitts et al (1984 Nature 309, 592-597) that the chromosomal translocations which bring the c-myc proto-oncogene into the vicinity of the IgH locus could make it a substrate for the antibody hypermutation mechanism. Recent work using hypermutating cell lines has provided evidence in support of this (Bemark, M and Neuberger, M. S. 2000 Oncogene 19, 3404-3410). A wider role for aberrant hypermutation came with the finding that several genes apart from the immunoglobulin V genes can (without being translocated into the Ig loci) apparently act as substrates for the antibody hypermutation mechanism in that they exhibit an increased frequency of point mutation in hypermutating B cells. Recent evidence also points to a high frequency of mutations in many B cell tumours and it has been proposed that this is a result of a transient hypermutation phase caused by the antibody hypermutation mechanism. In all these cases, the aberrant mutations are largely at dC/dG residues.

An uncontrolled and enhanced rate of mutation in non-antibody producing cells can also be deleterious. For example, mutations are the hallmark of cancer and the enhanced rate of mutation in cancer cells may explain their capability to continually grow and evade the normal human defences. The “mutator phenotype” hypothesis attributes this phenomenon to an increasing rate of errors in DNA replication as a tumour grows. According to this theory, genes encoding proteins normally interacting with nucleotides such as DNA polymerases and DNA repair enzymes may be faulty in cancer cells and therefore cause subsequent mutations.

In vitro, understanding and harnessing the means for controlling an enhanced rate of mutation can be usefully employed, for example, in generating diversity of gene products such as generating antibody diversity.

Many in vitro approaches to the generation of diversity in gene products rely on the generation of a very large number of mutants which are then selected using powerful selection technologies. For example, phage display technology has been highly successful in providing a vehicle that allows for the selection of a displayed protein (Smith, G. P. 1985 Science, 228, 1315-7; Bass et al. Proteins. 8, 309-314, 1990; McCafferty et al., 1990 Nature, 348, 552-4; for review see Clackson and Wells, 1994 Trends Biotechnol, 12, 173-84). Similarly, specific peptide ligands have been selected for binding to receptors by affinity selection using large libraries of peptides linked to the C terminus of the lac repressor LacI (Cull et al., 1992 Proc Natl Acad Sci USA, 89, 1865-9). When expressed in E. coli the repressor protein physically links the ligand to the encoding plasmid by binding to a lac operator sequence on the plasmid. Moreover, an entirely in vitro polysome display system has also been reported (Mattheakis et al., 1994 Proc Natl Acad Sci USA, 91, 9022-6) in which nascent peptides are physically attached via the ribosome to the RNA which encodes them.

Artificial selection systems to date rely heavily on initial mutation and selection, similar in concept to the initial phase DNA rearrangement involving the joining of immunoglobulin V, D and J gene segments which occurs in natural antibody production, in that it results in the generation of a “fixed” repertoire of gene product mutants from which gene products having the desired activity may be selected.

Unlike in the natural immune system, however, artificial selection systems are poorly suited to any facile form of “affinity maturation”, or cyclical steps of repertoire generation and development. One of the reasons for this is that it is difficult to generate enough mutations and to target these to regions of the molecule where they are required, so subsequent cycles of mutation and selection do not lead to the isolation of molecules with improved activity with sufficient efficiency.

In vivo, after the primary repertoire of antibody specificities is created by V-D-J rearrangement, and following antigen encounter in mouse and man, the rearranged V genes in those B cells that have been triggered by the antigen are subjected to two further types of genetic modification. Class switch recombination, a region-specific but largely non-homologous recombination process, leads to an isotype change in the constant region of the expressed antibody. Somatic hypermutation introduces multiple single nucleotide substitutions in and around the rearranged V gene segments. This hypermutation generates the secondary repertoire from which good binding specificities can be selected thereby allowing affinity maturation of the humoral immune response. In chicken and rabbits (but not man or mouse) an additional mechanism, gene conversion, is a major contributor to V gene diversification.

Much of what is known about the somatic hypermutation process which occurs during affinity maturation in natural antibody production has been derived from an analysis of the mutations that have occurred during hypermutation in vivo (for reviews see Neuberger and Milstein, 1995 Curr. Opin. Immunol. 7, 248-254; Weill and Reynaud, 1996 Immunol Today 17, 92-97; Parham, 1998 Immunological Reviews, Vol. 162 (Copenhagen, Denmark: Munksgaard)). Most of these mutations are single nucleotide substitutions which are introduced in a stepwise manner. They are scattered over the rearranged V domain, though with characteristic hotspots, and the substitutions exhibit a bias for base transitions. The mutations largely accumulate during B cell expansion in germinal centres (rather than during other stages of B cell differentiation and proliferation) with the rate of incorporation of nucleotide substitutions into the V gene during the hypermutation phase estimated at between 10⁻⁴ and 10⁻³ bp⁻¹ generation⁻¹ (McKean et al., 1984; Berek & Milstein, 1988). However, a greater understanding of the steps involved in these later stages of hypermutation would enable a more diverse range of gene products to be obtained.
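As a rough worked illustration of the rate quoted above, the following sketch multiplies a per-base, per-generation substitution rate by a target length and a number of cell generations. The V-region length of 350 bp and the figure of 10 generations are illustrative assumptions and are not taken from this document; the function name is likewise hypothetical.

```python
def expected_substitutions(rate_per_bp_per_generation, length_bp, generations):
    """Expected number of substitutions accumulated in one lineage.

    Simple product of rate, target length and generations; it ignores
    hotspots, selection and repeated hits at the same base.
    """
    if rate_per_bp_per_generation < 0 or length_bp < 0 or generations < 0:
        raise ValueError("all arguments must be non-negative")
    return rate_per_bp_per_generation * length_bp * generations


# Assumed, illustrative figures: a 350 bp V region followed for 10 generations.
low = expected_substitutions(1e-4, 350, 10)
high = expected_substitutions(1e-3, 350, 10)
print(f"expected substitutions per lineage: {low:.2f} to {high:.2f}")
```

On these assumed figures the estimate spans roughly 0.35 to 3.5 substitutions per lineage, which gives a feel for the scale implied by the quoted rate rather than a measured value.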

All three of the above processes, somatic hypermutation, gene conversion and class-switch recombination, have been shown to depend upon activity of the protein Activation Induced Deaminase (AID) (Muramatsu et al. (1999); Muramatsu M. et al. (2000); Revy, P. et al. (2000); Arakawa, H. et al. (2002); Harris, R. S. et al. (2002); Martin, A. et al. (2002) and Okazaki, I. et al. (2002)) which has been suggested (by virtue of its homology with Apobec-1 (Muramatsu et al. (1999))) to act by RNA editing. However, evidence that the three processes could be initiated by a common type of DNA lesion (Maizels et al. (1995); Weill et al. (1996); Sale et al. (2001); Ehrenstein et al. (1999)) taken with the fact that the first phase of hypermutation targets dG/dC (Martin et al. (2002); Rada et al. (1998); Wiesendanger et al. (2000)) has suggested that AID may act directly on dG/dC pairs in the immunoglobulin locus. However, to date, the actual function of AID has not been described.

The AID homologue Apobec-1 has been identified as playing a role in modifying RNA. Apobec-1 is a catalytic component of the apolipoprotein B (apoB) RNA editing complex that performs the deamination of C₆₆₆₆ to U in intestinal apoB RNA thereby generating a premature stop codon. Indeed, the oncogenic activity of Apobec-1, identified by its overexpression in transgenic mice, has previously been attributed to its RNA editing activity acting on inappropriate substrates.

Deamination of cytosine to uracil can occur in vivo at the level of nucleotide and in DNA as well as RNA. In the context of DNA, the low level deamination of cytosine to uracil which takes place spontaneously (and which might be of relatively minor significance when it occurs with free nucleotides or in mRNA) can have major effects, contributing to genome mutation, cancer and evolution (Lindahl, T. (1993) Nature 362, 709-715). However, to date, there is no biochemical evidence that APOBEC family members can trigger such deamination in vitro.

SUMMARY OF THE INVENTION

The present inventors have demonstrated that expression of AID in Escherichia coli gives a mutator phenotype yielding DNA nucleotide transitions at dG/dC. The mutation frequency is enhanced by deficiency in uracil-DNA glycosylase indicating that AID acts by deaminating dC residues in DNA.
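The link between dC deamination and transitions at dG/dC can be sketched in a few lines. Deamination converts dC to dU, and an unrepaired dU templates dA at replication, so the top strand reads C to T when its own dC was deaminated and G to A when the dC on the complementary strand was deaminated. The function below is a minimal illustration of that bookkeeping only; its name and arguments are hypothetical and it does not model repair, sequence context or strand preference.

```python
def transition_after_deamination(top_strand, position, strand="top"):
    """Return the top-strand sequence fixed after one unrepaired dC deamination.

    strand="top":    the dC at `position` on the top strand is deaminated,
                     giving a C -> T transition.
    strand="bottom": the dC paired with a top-strand dG at `position` is
                     deaminated, read on the top strand as G -> A.
    """
    base = top_strand[position]
    if strand == "top":
        if base != "C":
            raise ValueError("top-strand deamination requires a C at this position")
        new_base = "T"
    elif strand == "bottom":
        if base != "G":
            raise ValueError("bottom-strand deamination requires a G at this position")
        new_base = "A"
    else:
        raise ValueError("strand must be 'top' or 'bottom'")
    return top_strand[:position] + new_base + top_strand[position + 1:]


print(transition_after_deamination("ATGCCA", 3))            # C -> T on the top strand
print(transition_after_deamination("ATGCCA", 2, "bottom"))  # G -> A seen on the top strand
```

Because uracil-DNA glycosylase would normally excise the dU before it is copied, this picture is consistent with the enhanced mutation frequency reported above for cells deficient in that enzyme.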

In addition, the expression of AID homologues, Apobec-1, Apobec3C and Apobec3G, including their expression as part of a fusion protein in E. coli, also yields a mutator phenotype and these homologues show an increased potency of mutator activity on DNA sequences when compared to AID.

Furthermore, deamination of cytosine to uracil in DNA can be achieved in vitro using partially purified APOBEC1 from extracts of transformed Escherichia coli. Its activity on DNA is specific for single-stranded DNA and exhibits dependence on local sequence context.

Accordingly, in a first aspect of the invention there is provided a cell modified to express AID, or an AID variant, derivative or homologue, and having a mutator phenotype.

Suitably, the cell is modified to stably express AID, or an AID variant, derivative or homologue, and having a mutator phenotype.

By “stable expression” of a gene is meant that the gene and its expression are substantially maintained in successive generations of cells derived from transfected cells. In particular, the term “stable expression” is not intended to encompass the transient expression of a protein in a bacterial cell for the purpose of protein purification.

In another embodiment, the cell is transiently transfected to express AID, or an AID variant, derivative or homologue, and having a mutator phenotype.

As used herein, “mutator phenotype” means an increased mutation frequency in the transfected cells modified to express AID or its homologues when compared to non-modified, non-transfected cells. Methods for measuring mutation frequency are described herein. Suitably the mutations are nucleotide transitions at dG/dC as a result of deamination of dC residues in DNA. The term “mutator activity” refers to the activity that confers the mutator phenotype.

In one embodiment, said cell is a prokaryotic cell, such as a bacterial cell. Suitable bacteria include E. coli.

In another embodiment, the cell is a modified eukaryotic cell in which altered AID expression has been induced by introduction of an AID gene with the proviso that said eukaryotic cell is not a cell of the human B lymphocyte lineage and, in particular, is not a human Ramos, BL-2 or CL-01 cell nor a cell derived from the chicken cell line, DT40. Suitably said cell is derived from mouse or man and is capable of generating immunoglobulin diversity through somatic hypermutation or class switching.

In another embodiment, the AID homologue is Apobec and is, in particular, selected from Apobec family members such as Apobec-1, Apobec3C or Apobec3G (described, for example, by Jarmuz et al (2002)).

In yet another embodiment, the AID variant is a fusion protein. Suitably, said fusion protein is AID, Apobec-1, Apobec3C or Apobec3G in which a heterologous protein or peptide domain has been fused at either its N- or C-terminus. Preferably, the heterologous peptide is fused at the amino terminus. Suitably, said heterologous peptide domain is a binding domain which is one half of a specific binding pair which can interact with the second half of said pair to form a complex. Suitable binding pairs include two complementary components which can bind in a specific binding reaction. Examples of specific binding pairs include His-tag/nickel, DNA binding domain/DNA binding domain recognition sequence, antibody/antigen, biotin/streptavidin etc.

The data presented herein are consistent with AID or its homologues activating deamination of dC, as an enhancement of the effect is observed in cells lacking uracil-DNA glycosylase (UDG).

Accordingly, in another embodiment, said cell further comprises a genetic background which confers an enhanced mutator phenotype effect. In a particularly preferred embodiment, the genetic background of a prokaryotic cell confers a UDG deficiency on the cell. Said UDG deficiency is preferably induced by interfering with UDG expression such as, for example, creating an ung- background. In some E. coli ung-1 mutants, some back up UDG activity is provided by the product of the mug gene. Thus, in a further embodiment, the cell comprises a combined background of ung- and mug-.

The introduction of modified expression of AID or an AID homologue into a cell can increase the mutation rate above the background mutation rate that would normally be observed in that cell. Suitably, the modified cell is capable of generating mutations in a defined gene product. This can be particularly useful in the generation of gene diversity, for example in the generation of antibody diversity where the defined gene product is an immunoglobulin V region gene.

Such cells according to any embodiment of the first aspect and displaying an enhanced rate of mutation can be useful in a method for preparing a gene product having a desired activity.

Preferably the gene product which it is desired to mutate is provided to AID or its homologues as single-stranded DNA. Single stranded DNA may be provided by introducing single stranded DNA directly or by introducing double stranded DNA which is later converted to single stranded, for example, through enzymatic action such as helicase or transcriptase activity.

In another aspect of the invention, there is provided a fusion protein comprising an AID, or AID variant, derivative or homologue, polypeptide having a mutator phenotype operably linked to one half of a specific binding pair.

The term “operably linked” refers to a juxtaposition wherein the components described are in a relationship permitting them to function in their intended manner. For example, an AID polypeptide “operably linked” to one half of a specific binding pair is linked through ligation of the nucleic acid coding sequences or otherwise such that a fusion protein is produced in which the mutator activity of AID is unimpaired whilst allowing the specific binding pair to form through interaction of the said one half with its complement.

In a preferred embodiment, the one half of the specific binding pair in said fusion protein is a DNA binding domain.

Preferably, the AID homologue is one of the Apobec family of proteins and, suitably, is selected from the group consisting of Apobec-1, Apobec-3G and Apobec-3C.

In another aspect of the invention, there is provided a vector for expressing a fusion protein in accordance with the previous aspect.

In yet another aspect of the invention, there is provided a cell modified to express a fusion protein in accordance with that aspect of the invention.

The mutator activity of AID can be harnessed to drive mutation of specific gene products of interest. Accordingly, in a further aspect of the invention there is provided a method for preparing a gene product having a desired activity, comprising the steps of:

-   a) expressing a nucleic acid encoding the gene product in a population of cells according to the invention;
-   b) identifying a cell or cells within the population of cells which expresses a mutant gene product having the desired activity; and
-   c) establishing one or more clonal populations of cells from the cell or cells identified in step (b), and selecting from said clonal populations a cell or cells which expresses a gene product having an improved desired activity.
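The cycle in steps a) to c) can be pictured as a loop of mutation, screening and clonal expansion. The sketch below is a toy in silico analogy only, not the laboratory method: sequences stand in for cells, C to T and G to A transitions stand in for the mutator activity, and the scoring function, mutation rate, population size and seed are all hypothetical parameters chosen for illustration.

```python
import random

# Top-strand outcomes of dC deamination on either strand.
TRANSITIONS = {"C": "T", "G": "A"}


def mutate(seq, rate, rng):
    """Apply C->T / G->A transitions independently at each C or G."""
    return "".join(
        TRANSITIONS[b] if b in TRANSITIONS and rng.random() < rate else b
        for b in seq
    )


def evolve(start, score, rounds=3, population=200, rate=0.05, seed=0):
    """Repeat mutation and selection, keeping a variant only if it scores higher."""
    rng = random.Random(seed)
    best = start
    for _ in range(rounds):
        # Step a): express the sequence in a mutating population.
        variants = [mutate(best, rate, rng) for _ in range(population)]
        # Step b): identify the variant with the highest desired activity.
        candidate = max(variants, key=score)
        # Step c): found the next clonal population from an improved variant.
        if score(candidate) > score(best):
            best = candidate
    return best


# Hypothetical activity: number of T residues (a stand-in for a real assay).
toy_score = lambda s: s.count("T")
print(evolve("ACGCGCAC", toy_score))
```

Keeping the parent when no variant improves mirrors the requirement in step c) that the selected cells show an improved desired activity; a real screen would replace the toy score with a binding or enzymatic assay.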

In one embodiment, the nucleic acid encoding the gene product is available to AID or an AID homologue as single-stranded DNA.

Suitably, the nucleic acid encoding the gene product is operably linked to one component of a specific binding pair. In this embodiment, a nucleic acid operably linked to the one component, or second half, of a specific binding pair is ligated in such a way that the binding of the other component, or first half, of a specific binding pair can take place. Thus, where the first half of a specific binding pair is linked in a fusion protein to the AID polypeptide having mutator activity, binding of the first and second halves of the specific binding pair brings the mutator protein into range with the nucleic acid sequence such that directed mutation of that particular nucleic acid sequence can take place.

In a particularly preferred embodiment, the specific binding pair is a DNA binding protein-DNA binding protein recognition sequence. In this embodiment, the population of cells comprises cells expressing a fusion protein being a fusion of AID polypeptide to a DNA binding protein (or DNA binding domain) and the nucleic acid sequence encoding the gene product is operably linked to the DNA binding protein recognition sequence. This would allow the mutator activity of AID or its homologues to be specifically directed to the nucleic acid encoding the gene product of interest.

Accordingly, in another aspect of the invention, there is provided a method for directing mutation to a specific gene product of interest. Suitably said method comprises the steps of:

-   i) generating a nucleic acid construct comprising a nucleic acid sequence encoding a gene product operably linked to a DNA binding protein recognition sequence;
-   ii) transfecting said nucleic acid construct into a population of host cells expressing a fusion protein in accordance with the invention;
-   iii) incubating said transfected host cells under conditions suitable for allowing the specific binding pairing of DNA binding protein to DNA binding protein recognition sequence to occur;
-   iv) identifying a cell or cells within the population of cells which expresses a mutant gene product having the desired activity; and
-   v) establishing one or more clonal populations of cells from the cell or cells identified in step (iv), and selecting from said clonal populations a cell or cells which expresses a gene product having an improved desired activity.

Suitably said host cells may be prokaryotic cells, for example bacterial cells such as E. coli, or they may be eukaryotic cells such as yeast or mammalian cells.

In one embodiment, the population of cells in accordance with the invention is derived from a clonal or polyclonal population of cells which comprises cells capable of constitutive hypermutation of V region genes.

The gene product may be an endogenous gene product such as the endogenous immunoglobulin polypeptide, a gene product expressed by a manipulated endogenous gene or a gene product expressed by a heterologous transcription unit operatively linked to control sequences which direct somatic hypermutation, as described further below. In this embodiment, the gene product is operably linked to a nucleic acid which directs hypermutation.

Alternatively, the gene product may be a heterologous gene product.

The nucleic acid which is expressed in the cells of the invention and subjected to hypermutation may be an endogenous region, such as the endogenous V region, or a heterologous region inserted into the cell line of the invention. This may take the form, for example, of a replacement of the endogenous V region with heterologous transcription unit(s), such as a heterologous V region, retaining the endogenous control sequences which direct hypermutation; or of the insertion into the cell of a heterologous transcription unit under the control of its own control sequences to direct hypermutation, wherein the transcription unit may encode V region genes or any other desired gene product. The nucleic acid according to the invention is described in more detail below.

In another embodiment the gene product may be an endogenous gene product which is not normally subject to hypermutation. Suitable gene products include the products of genes implicated in disease, oncogenes and other target genes. Thus, the gene product may be any gene product in which mutation is desirable.

In one embodiment, the endogenous or heterologous gene may be integrated into a chromosome.

In step b) or step (iv) above, the cells are screened for the desired gene product activity. This may be, for example in the case of immunoglobulins, a binding activity. Other activities may also be assessed, such as enzymatic activities or the like, using appropriate assay procedures. Where the gene product is displayed on the surface of the cell, cells which produce the desired activity may be isolated by detection of the activity on the cell surface, for example by fluorescence, or by immobilising the cell to a substrate via the surface gene product. Where the activity is secreted into the growth medium, or otherwise assessable only for the entire cell culture as opposed to in each individual cell, it is advantageous to establish a plurality of clonal populations from step a) in order to increase the probability of identifying a cell which secretes a gene product having the desired activity. Advantageously, the selection system employed does not affect the cell's ability to proliferate and mutate.

Preferably, at this stage (and in step c) or step v)) cells which express gene products having a better, improved or more desirable activity are selected. Such an activity is, for example, a higher affinity binding for a given ligand, or a more effective enzymatic activity. Thus, the method allows for selection of cells on the basis of a qualitative and/or quantitative assessment of the desired activity. Successive rounds of selection may allow for directed evolution in a gene product. Selection of mutants may also be achieved by growth or selection on selective media as described herein.

In a preferred embodiment, the “population of cells” in the method is a population of prokaryotic cells. In another embodiment, the “population of cells” is a population of yeast cells.

The targeted mutation of a specific gene product of interest can be enhanced by providing the nucleic acid encoding the gene product in a modified construct. Suitably the construct is arranged so as to favour generation of a single-stranded substrate oligonucleotide (i.e. the nucleic acid encoding the gene product of interest). An increased availability of single stranded DNA can be achieved by providing the substrate oligonucleotide between two convergent promoters. In one embodiment, this construct favours the generation of single stranded DNA through DNA bending caused by promoter activity. In another embodiment, this construct favours single stranded DNA through bi-directional transcription activation.

Accordingly, in another aspect of the invention there is provided a construct for use in a method in accordance with the invention, said construct comprising a nucleic acid encoding the gene product of interest wherein said nucleic acid is placed under the control of a first promoter upstream of the coding sequence and further comprising a second promoter downstream of the coding sequence in the opposite orientation. Such a construct may be referred to as a construct for convergent transcription.
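The layout of a construct for convergent transcription can be expressed as a simple check on feature coordinates and orientation. The sketch below is illustrative only: the Feature record, the coordinate convention (half-open intervals on the top strand, with strand given as +1 or -1) and the example positions are assumptions introduced here, not details from this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    name: str
    start: int   # inclusive, top-strand coordinate
    end: int     # exclusive, top-strand coordinate
    strand: int  # +1 reads left to right, -1 reads right to left


def is_convergent(first_promoter, coding, second_promoter):
    """True if two promoters flank the coding sequence and point towards it."""
    flanking = (
        first_promoter.end <= coding.start
        and coding.end <= second_promoter.start
    )
    facing = first_promoter.strand == +1 and second_promoter.strand == -1
    return flanking and facing


# Hypothetical example coordinates.
upstream = Feature("first promoter", 0, 80, +1)
gene = Feature("gene of interest", 100, 1000, +1)
downstream = Feature("second promoter", 1020, 1100, -1)
print(is_convergent(upstream, gene, downstream))
```

A check of this kind only confirms the arrangement described above; it says nothing about promoter strength or about how much single stranded DNA a given pair of promoters would actually expose.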

A number of suitable promoter sequences are known to those skilled in the art. For example, suitable prokaryotic promoters include activator-controlled promoters such as AraBAD and PhoA; repressor-controlled promoters such as Tet, Lac and Trp; hybrid Lac/Trp promoters such as Tac; pL and regulatable hybrids of pL such as pL-tet; and viral polymerase promoters such as T7. Suitable eukaryotic promoters include, for example, RNA Polymerase I promoters (e.g. 45S rDNA), RNA Polymerase II promoters (e.g. Gal4, β-Actin, viral promoters such as CMV-IE, and artificial promoters including Tet-on and Tet-off) or RNA Polymerase III promoters including H1 RNA and U6 snRNA. In particular, promoters include the PhoB promoter and inducible promoters such as the IPTG inducible Trc promoter. Suitably said construct is as described in the examples section herein.

In another aspect of the invention there is provided a method of identifying components of AID-dependent mutation activity comprising expressing AID in a cell deficient in a particular gene and assessing mutator activity compared to activity in a cell expressing said gene.

By “components of AID-dependent mutation activity” is meant aspects or cellular components which contribute to the molecular role of AID (or its homologues) and includes proteins or nucleic acid components which interact with AID in its mutator function.

In a further aspect of the invention there is provided a method of screening for a modulator of AID activity comprising:

-   expressing AID in a prokaryotic cell;
-   maintaining the AID-expressing prokaryotic cell in the presence of a selectable medium;
-   detecting the presence of colonies in the absence or presence of a test compound wherein a modified number of colonies when compared to a sample in the absence of a test compound is indicative of the ability of the test compound to modify AID mutator activity.
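The comparison of colony numbers in the screening steps above can be sketched as a small calculation. Normalising the colonies that grow on selective medium to the number of viable cells plated is an assumption made here for illustration, as are the function names and the example counts; the document itself only requires that colony numbers with and without the test compound be compared.

```python
def mutation_frequency(resistant_colonies, viable_cells_plated):
    """Colonies on selective medium per viable cell plated."""
    if viable_cells_plated <= 0:
        raise ValueError("viable_cells_plated must be positive")
    if resistant_colonies < 0:
        raise ValueError("resistant_colonies must be non-negative")
    return resistant_colonies / viable_cells_plated


def fold_change(frequency_with_compound, frequency_without_compound):
    """Ratio above 1 suggests enhanced, below 1 reduced, mutator activity."""
    if frequency_without_compound <= 0:
        raise ValueError("reference frequency must be positive")
    return frequency_with_compound / frequency_without_compound


# Hypothetical counts: 150 colonies with the compound, 50 without,
# each from 1e9 viable cells plated.
with_compound = mutation_frequency(150, 1e9)
without_compound = mutation_frequency(50, 1e9)
print(f"fold change in mutation frequency: {fold_change(with_compound, without_compound):.1f}")
```

In practice several independent cultures per condition would be compared, since colony counts from a single culture vary widely; the single-pair calculation here only shows the arithmetic.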

By “AID activity” is meant activity of AID or any of its homologues.

Preferably, the modified number of colonies in the presence of the test compound is an increased number and is therefore indicative of enhanced AID-mediated mutation.

In another aspect of the invention there is provided a method of conferring a mutator phenotype on a cell comprising expressing AID or its homologues in a cell.

Modifying a cell to confer an increased frequency of mutations by introducing AID expression is equivalent to a method of introducing mutations into a cell comprising expressing AID, the mutator protein.

In another aspect of the invention, there is provided a use of AID or a functional homologue thereof in triggering mutation in a cell. In particular, there is provided a use of AID to introduce nucleotide transitions at dG/dC as a result of deamination of dC residues in DNA.

There are several members of the AID/apobec/phorbolin family in humans (Jarmuz et al. (2002)). Indeed, overexpression of Apobec-1 is oncogenic in mice (Yamanaka S. et al. (1995)) and Apobec-1 family members are expressed in many tumour cell lines. The mutator activity demonstrated herein provides a molecular explanation for the mechanism of this oncogenesis. Tumour cells generally show an enhanced rate of mutation compared with non-tumour cells with mutations at dC/dG being the most common nucleotide substitutions. Thus, the ability to modulate gene products that trigger mutation provides a method of treating disorders characterised by an increased mutation rate, such as cancer.

Accordingly, in another aspect of the invention there is provided a method for treating a disorder characterised by increased mutations comprising treating an individual having such a disorder with an agent that modifies AID or AID homologue functional activity or gene expression. Suitably the disorder is selected from cancer, autoimmune disease or other disorders in which increased mutations are correlated with the disease phenotype.

In one embodiment said treatment may be prophylactic i.e. a preventative treatment. This is particularly applicable to treatment of an individual that may be predisposed to the development of a specific disorder. For example, an individual may be predisposed to develop a cancer through, for example, overexpression of AID or its homologues. In such an individual prophylactic treatment with an agent that modifies AID or AID homologue functional activity or gene expression may act to prevent the condition developing.

In a preferred embodiment of this aspect, the AID homologue is Apobec-1, Apobec-3G or Apobec-3C.

The development of resistance to antibiotics by a population of bacteria is a problem in treatment of everyday infections. The ability to decrease the rate at which mutations conferring antibiotic resistance arise would be desirable. Understanding the role of AID in generating mutations along with the observation that bacterial cells express proteins having a similar activity to AID (see, for example, Shen et al. (1992); Navaratnam et al. (1998)) enables modification of an AID-like mutator activity in bacteria to modify the rate at which antibiotic resistance arises. Accordingly in another aspect of the invention there is provided a method of decreasing hypermutation/resistance to a compound such as an antibiotic in a population of bacteria by modulating bacterial AID-like activity.

DEFINITIONS

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art (e.g., in cell culture, molecular genetics, nucleic acid chemistry, hybridisation techniques and biochemistry). Standard techniques are used for molecular, genetic and biochemical methods. See, generally, Sambrook et al., Molecular Cloning: A Laboratory Manual, 2d ed. (1989) Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. and Ausubel et al., Short Protocols in Molecular Biology (1999) 4th Ed, John Wiley & Sons, Inc.; as well as Guthrie et al., Guide to Yeast Genetics and Molecular Biology, Methods in Enzymology, Vol. 194, Academic Press, Inc., (1991), PCR Protocols: A Guide to Methods and Applications (Innis, et al. 1990. Academic Press, San Diego, Calif.), McPherson et al, PCR Volume 1, Oxford University Press, (1991), Culture of Animal Cells: A Manual of Basic Technique, 2nd Ed. (R. I. Freshney. 1987. Liss, Inc. New York, N.Y.), and Gene Transfer and Expression Protocols, pp. 109-128, ed. E. J. Murray, The Humana Press Inc., Clifton, N.J. These documents are incorporated herein by reference.

The abbreviations used herein include: APOBEC1, apolipoprotein B editing complex catalytic subunit 1; AID, activation-induced deaminase; TLC, thin-layer chromatography; PEI, polyethylene imine; UDG, uracil-DNA glycosylase.

The terms “variant” or “derivative” in relation to AID polypeptideincludes any substitution of, variation of, modification of, replacementof, deletion of or addition of one (or more) amino acids from or to thepolypeptide sequence of AID. Preferably, nucleic acids encoding AID areunderstood to comprise variants or derivatives thereof.

Such “modifications” of AID polypeptides include fusion proteins inwhich AID polypeptide or a portion or fragment thereof is linked to orfused to another polypeptide or molecule.

The term “homologue” as used herein with respect to the nucleotidesequence and the amino acid sequence of AID may be synonymous withallelic variations in the AID sequences and includes the knownhomologues, for example, Apobec-1 and other Apobec homologues includingApobec3C, Apobec3G, phorbolin and functional homologues thereof.

The “functional activity” of a protein in the context of the presentinvention describes the function the protein performs in its nativeenvironment. Altering or modulating the functional activity of a proteinincludes within its scope increasing, decreasing or otherwise alteringthe native activity of the protein itself. In addition, it also includeswithin its scope increasing or decreasing the level of expression and/oraltering the intracellular distribution of the nucleic acid encoding theprotein, and/or altering the intracellular distribution of the proteinitself. By “AID mutation activity” or “mutator activity” is meant thefunctional activity of AID or its homologues to increase mutation abovebackground.

The term “expression” refers to the transcription of a gene’s DNA template to produce the corresponding mRNA and translation of this mRNA to produce the corresponding gene product (i.e., a peptide, polypeptide, or protein). The term “activates gene expression” refers to inducing or increasing the transcription of a gene in response to a treatment where such induction or increase is compared to the amount of gene expression in the absence of said treatment. Similarly, the terms “decreases gene expression” or “down-regulates gene expression” refer to inhibiting or blocking the transcription of a gene in response to a treatment and where such decrease or down-regulation is compared to the amount of gene expression in the absence of said treatment.

The “mutation rate” is the rate at which a particular mutation occurs, usually given as the number of events per gene per generation whereas “mutation frequency” is the frequency at which a particular mutant is found in the population.

“Hypermutation” or “increased mutation rate” or “increased mutation frequency” refers to the mutation of a nucleic acid in a cell at a rate above background. Preferably, hypermutation refers to a rate of mutation of between 10⁻⁵ and 10⁻³ bp⁻¹ generation⁻¹. This is greatly in excess of background mutation rates, which are of the order of 10⁻⁹ to 10⁻¹⁰ mutations bp⁻¹ generation⁻¹ (Drake et al., 1998 Genetics 148:1667-1686) and of spontaneous mutations observed in PCR. 30 cycles of amplification with Pfu polymerase would produce <0.05×10⁻³ mutations bp⁻¹ in the product, which in the present case would account for less than 1 in 100 of the observed mutations (Lundberg et al., 1991 Gene 108:1-6).
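The arithmetic behind the rates quoted above may be illustrated by the following minimal sketch. The rate ranges are taken from the preceding paragraph; the per-duplication Pfu error rate (PFU_ERROR_RATE) is an assumed illustrative value that is not stated in this specification, and all function names are hypothetical.

```python
# Illustrative comparison of the mutation rates quoted in the text.
# PFU_ERROR_RATE is an ASSUMED value, not a figure from this specification.

HYPERMUTATION_RANGE = (1e-5, 1e-3)  # bp^-1 generation^-1, from the text
BACKGROUND_RANGE = (1e-10, 1e-9)    # bp^-1 generation^-1, from the text
PFU_ERROR_RATE = 1.6e-6             # assumed errors per bp per duplication
PCR_CYCLES = 30                     # from the text


def is_hypermutation(rate: float) -> bool:
    """True if a rate (bp^-1 generation^-1) lies in the preferred range."""
    low, high = HYPERMUTATION_RANGE
    return low <= rate <= high


def pcr_error_load(error_rate: float, cycles: int) -> float:
    """Upper-bound estimate of PCR-introduced mutations per bp in the
    product, treating every cycle as one full duplication."""
    return error_rate * cycles


def fold_over_background(rate: float, background: float) -> float:
    """Ratio of an observed rate to a background rate."""
    return rate / background


if __name__ == "__main__":
    load = pcr_error_load(PFU_ERROR_RATE, PCR_CYCLES)
    print(f"PCR error load: {load:.2e} mutations per bp")
    print(f"Below 0.05e-3 per bp: {load < 0.05e-3}")
    print(f"Lowest fold over background: "
          f"{fold_over_background(HYPERMUTATION_RANGE[0], BACKGROUND_RANGE[1]):.0e}")
```

Under the assumed polymerase error rate, the estimated PCR contribution stays below the 0.05×10⁻³ mutations bp⁻¹ bound given above; a different assumed error rate would change the estimate proportionally.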

In vivo, hypermutation is a part of the natural generation of immunoglobulin diversity through generating variable chain (V) genes. According to one aspect of the present invention therefore, the cell line is preferably an immunoglobulin-producing cell line which is capable of producing at least one immunoglobulin V gene. A V gene may be a variable light chain (V_(L)) or variable heavy chain (V_(H)) gene, and may be produced as part of an entire immunoglobulin molecule; it may be a V gene from an antibody, a T-cell receptor or another member of the immunoglobulin superfamily. Members of the immunoglobulin superfamily are involved in many aspects of cellular and non-cellular interactions in vivo, including widespread roles in the immune system (for example, antibodies, T-cell receptor molecules and the like), involvement in cell adhesion (for example the ICAM molecules) and intracellular signalling (for example, receptor molecules, such as the PDGF receptor). Thus, preferred cell lines according to the invention are derived from B-cells. According to the present invention, it has been determined that cell lines derived from antibody-producing B cells may be isolated which retain the ability to hypermutate V region genes, yet do not hypermutate other genes.

“Class switching” or “switch recombination” is the recombination process in V gene rearrangement that leads to a change in the constant region of the expressed antibody. “Gene conversion” is an additional mechanism in the recombination process which is found to occur in chicken and rabbits (but not in human or mouse) and contributes to V gene diversification.

The term “constitutive hypermutation” refers to the ability of certain cell lines to cause alteration of the nucleic acid sequence of one or more specific sections of endogenous or transgene DNA in a constitutive manner, that is without the requirement for external stimulation. Generally, such hypermutation is directed. In cells capable of directed constitutive hypermutation, sequences outside of the specific sections of endogenous or transgene DNA are not subjected to mutation rates above background mutation rates. The sequences which undergo constitutive hypermutation are under the influence of hypermutation-recruiting elements, as described further below, which direct the hypermutation to the locus in question. Thus in the context of the present invention, target nucleic acid sequences, into which it is desirable to introduce mutations, may be constructed, for example by replacing V gene transcription units in loci which contain hypermutation-recruiting elements with another desired transcription unit, or by constructing artificial genes comprising hypermutation-recruiting elements.

The cell population which is subjected to selection by the method of the invention may be a polyclonal population, comprising a variety of cell types and/or a variety of target sequences, or a (mono-) clonal population of cells.

A clonal cell population is a population of cells derived from a single clone, such that the cells would be identical save for mutations occurring therein. Use of a clonal cell population preferably excludes co-culturing with other cell types, such as activated T-cells, with the aim of inducing V gene hypermutation.

BRIEF DESCRIPTION OF THE TABLES AND FIGURES

Table 1 shows the results of experiments in which AID was expressed in E. coli.

Table 2 shows the results of experiments in which AID and its homologues, Apobec-1, Apobec3C and Apobec3G were expressed in E. coli.

Table 3 shows the results of a second set of experiments in which AID and its homologues, Apobec-1, Apobec2, Apobec3C and Apobec3G were expressed in E. coli.

Table 4 shows the oligonucleotides used in Example 3.

FIGURE LEGENDS

FIG. 1 DNA deamination model of Ig gene diversification. For details, see text.

FIG. 2 Expression of AID in E. coli yields a mutator phenotype that is enhanced by UDG-deficiency. (a) Frequencies of Rif^(R) mutants generated following overnight culture (+IPTG) of E. coli KL16 carrying either the AID expression plasmid or the vector control. Each point represents the mutation frequency of an independent overnight culture. The fold enhancement by AID expression is indicated. (b) Mutation frequency of AID- and vector-transformed, UDG-deficient KL16 ung-1 cells. Performed and labeled as in (a), but note the differing y-axis scale. (c) Photograph of representative plates. The mutation frequency relative to the vector-transformed wildtype control is indicated in the centre of each plate. See Table 1 for additional data.

FIG. 3 Nature of the AID-induced Rif^(R) mutants. (a) Comparison of the distribution of independent rpoB mutations identified in Rif^(R) colonies obtained from AID- and vector-transformed cells. The data are combined from results obtained using both KL16 and AB1157 hosts, but the two hosts show no difference in their mutation spectrum. The underlined sequence (SEQ ID NO: 23 (nucleic acid sequence) and SEQ ID NO: 24 (amino acid sequence)) is the region of rpoB which is known (Jin & Zhou (1996)) to harbour the majority of mutations conferring Rif^(R). Less than 5% of the Rif^(R) sequenced clones did not show any mutations in this region. (b) Comparison of the types of rpoB nucleotide substitutions identified.

FIG. 4 Comparison of the independent gyrA mutations identified in Nal^(R) colonies of AID- and vector-transformed E. coli KL16. Less than 5% of the Nal^(R) clones analysed failed to show mutations in the sequenced region (SEQ ID NO: 25 (nucleic acid sequence) and SEQ ID NO: 26 (amino acid sequence)).

FIG. 5

(a) Frequencies of Rif^(R) mutants generated following overnight culture of cells carrying an APOBEC1 or AID expression construct or the vector control. Each point represents the mutation frequency of an independent overnight culture. The median mutation frequency and the fold enhancement by expression of the mutator are indicated.

(b) Effect of IPTG on APOBEC1-induced mutation to Rif^(R). The mutation observed in the absence of IPTG may well be due to pTrc99A promoter leakiness. Labeled as in (a).

(c) Single amino acid changes in APOBEC1 abrogate its ability to stimulate mutation to Rif^(R). Labeled as in (a).

(d) Comparison of average growth rates of vector- and APOBEC1-transformed cells propagated in the presence of the inducer IPTG. Five independent cultures were used for each measurement, but the standard deviations proved smaller than the symbols.

FIG. 6 Spectrum of Rif^(R) mutations found in cells expressing APOBEC1.

(a) Comparison of the distribution of independent Rif^(R) mutations within the region of rpoB (SEQ ID NO: 27) found in cells transformed with vector alone or an APOBEC1 expression construct. The preferred sites in AID-expressing cells are highlighted by dark boxes.

(b) Summary of the types of nucleotide substitutions in rpoB identified in Rif^(R) vector- and APOBEC1-transformed cells given as a percentage of the total database (120 from controls and 136 from APOBEC1-transformed cells).

FIG. 7 APOBEC1, APOBEC3C and APOBEC3G all stimulate mutation at dC/dG but with distinct target specificities.

(a) Schematic of the APOBEC1 family of mutator proteins depicting theputative zinc-binding deaminase motif and the conserved leucine-richregion. Other APOBEC1 family members also contain either single (APOBEC2and APOBEC3A) or double (APOBEC3B and APOBEC3F) putative zinc-bindingmotifs (Madsen et al.). APOBEC3D and APOBEC3E may be a single proteinwith two zinc-binding regions as evidenced by IMAGE clone 3915193 or twoseparate, single zinc-binding motif proteins (Jarmuz et al.). For eachprotein, the enhancement of mutation to Rif^(R) yielded by that protein(data from Table 3), the percentage of the mutations observed that werenucleotide transitions at dC or dG and the identity of the major rpoBmutational hotspots observed (the percentage of the total number of rpoBmutations observed at that hotspot given in parentheses) are all given.The total number of mutated rpoB sequences analysed (n) for each APOBEC1family member is given.

(b) Distribution of rpoB mutations in Rif^(R) mutants obtained using bacteria transformed with different APOBEC family members. There are 26 sites within the sequenced region of rpoB where a single nucleotide substitution can yield Rif^(R); at 11 of these sites, Rif^(R) can be achieved by a transition at dC or dG. The percentage of the total number of Rif^(R) mutations obtained with each APOBEC family member that occurred at each of these 11 sites is indicated. Mutations at other sites are not indicated (an omission which is mainly of significance to the depiction of the vector control).

FIG. 8 a shows the pRB700 construct comprising the Bacillus subtilis gene SacB under the control of the E. coli promoter for PhoB.

FIG. 8 b shows the pRB740 construct comprising a variant SacB cassette under the control of the PhoB promoter and also under the control of the strong IPTG inducible Trc promoter downstream and in the opposite orientation.

FIG. 9 shows the results of mutation analysis in mutants in the SacB cassette.

FIG. 10 a shows mutation frequency in constructs when transcription is induced in either or both directions.

FIG. 10 b shows the results of mutation frequency analysis. pRB700 and pRB740 are described in FIG. 9. Vector control and APOBEC-1 expression plasmids pTrc99a and pRH200 are as described (Harris et al 2002 Mol Cell. 10(5):1247-53). Growth media all include 100 μg/ml carbenicillin and 1 mM IPTG to maintain and induce expression plasmids. LB=Luria Bertani medium. Min MOPS=Minimal MOPS medium (Neidhardt et al 1974. Culture medium for enterobacteria. J Bacteriol. 119:736-47) using 0.1% glycerol as carbon source supplemented with 2 μM Zn²⁺ and 0.1% casamino-acids (C, D, G, H) or bacto-peptone (I).

FIG. 11 shows a table of results for mutation analysis.

FIG. 12 shows the results of assaying for DNA deaminase activity in crude extracts using the TLC-based assay.

A, Schematic representation of the TLC-based deaminase assay. α-[³²P]dCMP-labelled single-stranded DNA was incubated with the indicated extracts, purified, digested with P1 nuclease and analysed by TLC in one of two buffer systems.

B, Analysis by TLC in either the LiCl [panel (i)] or CH₃COOH+LiCl[panels (ii) and (iii)] buffer systems of the assay products ofα-[³²P]dCMP-labelled single-stranded DNA incubated with sonic extractsof E. coli transformants that carry plasmids directing theoverexpression of APOBEC1, APOBEC2, a mutant APOBEC1 (harbouring anE63->A substitution) or dCTP deaminase (DCD). Controls are provided byextracts from E. coli transformed with vector only (−) as well as bysubstrate DNA that has been subjected to chemical deamination usingbisulfite. The plasmid/host strain combination used for recombinantprotein expression was pTrc99/E. coli KL16 except where (as indicated)the pET vector was used (in which case the host strain was BL21DE3) orwhere activity was monitored using the E. coli SØ177 host (which isdeficient in both dcd and cdd deaminases). The migration of dUMP, dCMPand [³²P] inorganic phosphate (Pi) markers is indicated. The abundanceof wild-type and E63->A mutant APOBEC1 polypeptides in extracts wasmonitored by Western [lower part of panel (iii)].

FIG. 13 shows APOBEC1 fractionation.

A, Ion-exchange chromatography on Sepharose Mono-Q. Clarified lysates of APOBEC1 (and APOBEC1[E63->A])-expressing E. coli were loaded onto Mono-Q. The presence of APOBEC1 polypeptide was detected by Western blot [panel (ii)]. Deaminase activity was monitored by both TLC- and UDG-based assays [panels (i) and (iii)] in the total lysate (T), the flow through (FT) and in the 800 and 1000 mM-salt washes.

B, Gel filtration of the concentrated high (>1 M) salt eluate from theMono-Q column on Sephacryl S200. Fractions were analysed by: (i)SDS/PAGE; bands were excised and analysed by MALDI-TOF following in-geltrypsin digestion. The bands yielding peptide sequences derived fromAPOBEC1 and ribosomal proteins L1, 2, 6 and 9 and S4 are indicated. M,molecular weight markers. (ii) Western blotting for APOBEC1; (iii)TLC-based and (iv) UDG-based deaminase assays, which were performed onsamples of the total clarified bacterial lysate (T) as well as on theeluate from the Mono-Q. The UDG-based deaminase assay was performedusing 3′-α-[³²P]-labelled SPM274; note that some of the 3′-label isremoved during the incubation. The percentage of label associated withthe 26-base product of the deamination/cleavage (as opposed to 40-baseinput oligonucleotide) is indicated.

FIG. 14 shows specificity of APOBEC1-mediated DNA deamination using the UDG-based assay.

A, Schematic representation of the UDG-based deaminase assay. 5′-biotinylated (circle) oligonucleotides that were 3′-labelled (asterisk) with fluorescein or α-[³²P]dideoxyadenylate were incubated with APOBEC1-containing (or control) samples prior to streptavidin purification, UDG-treatment and PAGE-urea analysis.

B, Partially purified APOBEC1 as well as the E63->A mutant were tested for their ability to deaminate 3′-fluorescein conjugated oligonucleotide SPM168 using the UDG-based assay. The fluorescence scan of the gel, including controls performed without UDG treatment or without APOBEC1, is shown with the positions of the expected products and size markers indicated.

C, Time-course of SPM168 deamination by partially purified APOBEC1.

D, Inclusion of RNAase (1 μg) or of tetrahydrouridine (THU; 20 nmoles, 2 nmoles, or 200 pmoles) does not inhibit the activity of APOBEC1.

E, Deaminating activity is specific for a single-stranded substrate. The assay was performed using 3′-fluorescein-labelled oligonucleotide SPM168 in the presence of the indicated ratio of either oligonucleotide SPM171 (which is complementary to SPM168) or SPM201 (which is not).

F, Comparison of 3′-fluorescein labelled oligonucleotides SPM168 (left three lanes) and SPM163 (right three lanes) as targets for deamination by 0.5, 1 and 2 μl of APOBEC1.

FIG. 15 Autoradiographs showing hybridisation of APOBEC1, APOBEC3G, and ubiquitin (control) probes to matched pairs of tumour (T) and corresponding normal (N) cDNA samples derived from a variety of tissues using a cancer profiling array (Clontech).

DETAILED DESCRIPTION OF THE INVENTION

The fact that AID, a homologue of Apobec-1 (which deaminates C in RNA), is required for all three programmes of diversification of rearranged immunoglobulin genes (Muramatsu M. et al. (2000); Revy, P. et al. (2000); Arakawa, H. et al. (2002); Harris, R. S. et al. (2002) and Martin, A. et al. (2002)) and that the initiation of all three programmes could be explained by DNA modification at dG/dC (Martin et al. (2002), Maizels et al. (1995), Weill et al. (1996); Sale et al. (2001), Ehrenstein et al. (1999), Rada et al. (1998) and Wiesendanger et al. (2000)) led the present inventors to the model presented in FIG. 1. The hypothesis set out herein is that AID mediates the deamination of a small number of C residues within the Ig loci. Conventionally, this would trigger base excision repair (Lindahl T. (2000)) with uracil being removed by uracil-DNA glycosylase (UDG) and, following cleavage at the abasic site by an apyrimidinic endonuclease (APE), a dC residue would be reinserted by a DNA polymerase/deoxyribophosphodiesterase. If, instead of being repaired, the DNA strand harbouring the dU residue were used to template DNA synthesis, then the consequence would be a dC→dT (and dG→dA) transition. Alternatively, if DNA synthesis occurred over the abasic site, both transitions and transversions would be generated although a transition bias might still be observed if the polymerase used for the lesion bypass preferentially inserted dA residues. Thus, the stage at which polymerase bypass of the original lesion occurred as well as the preferences of the polymerase used would affect the transition bias of the hypermutation. This could account for the otherwise puzzling observation that whereas mutation in mouse and man as well as in the hypermutating Ramos B cell line exhibits a marked transition preference (Sale et al. (1998)), no such preference is evident in the mutations exhibited by the XRCC2-deficient chicken DT40 B cell line (Sale et al. (2001)).
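The replication branch of the model described above, in which a dU-bearing strand templates DNA synthesis without prior repair, may be illustrated by the following simplified sketch. It is an illustration under stated assumptions only: strands are written in the same left-to-right register (no reversal), dU is assumed to pair like dT, and repair, abasic-site bypass and strand polarity are ignored. All function names are hypothetical.

```python
# Toy model: dC -> dU deamination followed by two rounds of templated
# synthesis, showing how dC->dT and dG->dA transitions become fixed.
# ASSUMPTIONS: dU templates dA; no repair; strands shown unreversed.

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C", "U": "A"}


def deaminate(strand: str, positions) -> str:
    """Convert dC to dU at the given 0-based positions of one strand."""
    bases = list(strand.upper())
    for i in positions:
        if bases[i] != "C":
            raise ValueError(f"position {i} is {bases[i]}, not C")
        bases[i] = "U"
    return "".join(bases)


def replicate(template: str) -> str:
    """Return the newly synthesised partner strand, base by base."""
    return "".join(COMPLEMENT[b] for b in template.upper())


def fix_after_replication(top: str, positions):
    """Deaminate the top strand, then copy it twice without repair.

    Returns (new_top, new_bottom) after the lesion has been fixed."""
    damaged = deaminate(top, positions)
    new_bottom = replicate(damaged)  # dA inserted opposite dU (dG->dA)
    new_top = replicate(new_bottom)  # dT inserted opposite dA (dC->dT)
    return new_top, new_bottom


if __name__ == "__main__":
    top = "GACGT"
    print("original top/bottom:", top, replicate(top))
    print("after fixation:     ", *fix_after_replication(top, [2]))
```

In this sketch the deaminated position reads dT on one strand and dA on the other after fixation, corresponding to the dC→dT (and dG→dA) transition described in the text.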

Templated repair of the deamination-induced lesion by a V pseudogene would lead to gene conversion; such repair would be dependent on the RAD51 paralogues XRCC2, XRCC3 and RAD51B (Sale et al. (2001)). The second phase of mutation (yielding mutations at dA/dT) which is observed in vivo in man and mouse would be triggered by MSH2/MSH6 recognition of the dU/dG mismatch itself or of some intermediate in its correction (Rada et al. (1998); Wiesendanger et al. (2000)), and would presumably occur by some form of patch repair. Repair partnered on another switch region could lead to switch recombination. For switching, where there is an indication for a role of non-homologous end-joining (Manis et al. (2002); Peterson et al. (2001)), one might imagine that deamination of proximal dCs on opposite strands could generate the staggered DNA breaks proposed by Chen et al (2001).

A central prediction of this model is that AID has the ability to trigger dC→dU deamination in DNA. Such an activity would presumably be largely restricted to its physiological target (the Ig loci) since a rampant DNA deaminase activity would likely be harmful to the cell.

The results presented herein suggest that, whereas functional Ig genes are generated by RAG-mediated rearrangement, subsequent diversification is triggered by AID-mediated deamination of dC residues within the immunoglobulin locus with the outcome (gene conversion, switch recombination or mutation phases 1/2) dependent upon the way in which the initiating dU/dG lesion is resolved.

As well as AID, the APOBEC/AID family contains several members that are capable of mutating DNA, triggering nucleotide substitutions at dC/dG by a process which, given its sensitivity to uracil-DNA glycosylase, is likely to be dC deamination.

The physiological functions of the other APOBEC family members are unknown. Whereas APOBEC1 shows relatively restricted tissue distribution, APOBEC3G is much more widely expressed. Hybridisation experiments suggest that some APOBEC family members are well expressed in a variety of cancers (FIG. 8) and cancer cell lines.

Quite apart, however, from the normal physiological functions of the APOBEC family members, the fact that several of the members can display a DNA mutator activity (taken together with the observation that transgenic expression of APOBEC1 is oncogenic in mice) raises the possibility that they might contribute to the ‘spontaneous’ dC deamination that occurs in normal cells as well as the elevated mutation rates proposed to be associated with many human cancers. Indeed, in the large database of p53 mutations in human cancers (where nearly 13,000 single base changes have been identified scattered over a large number of positions in the gene) over 50% of the substitution mutations (and over 60% of the silent mutations) are nucleotide transitions at dC/dG with roughly half of these dC/dGs being at dCpdG dinucleotides.

Measuring an Enhanced Mutation Rate in Cells as an Indication of a Mutator Phenotype

Hypermutating cells or cells having a mutator phenotype may be identified by a variety of techniques, including sequencing of target sequences, selection for expression loss mutants, assay using bacterial MutS protein and selection for change in gene product activity. Methods for measuring mutation rates include fluctuation analysis (described, for example, by Luria and Delbrück (1943) and Capizzi and Jameson (1973)). In this, the generation of clones showing resistance to a selection medium is measured. Suitable selection media for prokaryotic cells include rifampicin, nalidixic acid, valine and fucose. Cells selected according to this procedure are cells in which mutation has occurred in a gene or genes which enable the effect of the selection media to be overcome. Other ways of determining mutation rates include direct sequencing of specific portions of DNA or indirect methods such as the MutS assay (Jolly et al., 1997 Nucleic Acids Research 25, 1913-1919) or monitoring the generation of immunoglobulin loss variants.
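By way of illustration only, the quantities used in a fluctuation analysis may be computed as in the following minimal sketch. It uses the p0 estimator associated with Luria and Delbrück, in which the mean number of mutation events per culture is taken as the negative natural logarithm of the fraction of parallel cultures containing no mutants; this is one of several possible estimators and is not stated to be the method used in the Examples. The function names and the example counts are hypothetical.

```python
import math
from statistics import median


def mutation_frequency(mutant_colonies: int, viable_cells: float) -> float:
    """Mutant frequency of one culture: resistant colonies per viable cell."""
    if viable_cells <= 0:
        raise ValueError("viable_cells must be positive")
    return mutant_colonies / viable_cells


def p0_mutation_rate(mutant_counts, cells_per_culture: float) -> float:
    """Estimate mutation events per cell from parallel cultures (p0 method).

    p0 is the fraction of cultures with no mutants; m = -ln(p0) is the
    mean number of mutation events per culture."""
    counts = list(mutant_counts)
    p0 = sum(1 for c in counts if c == 0) / len(counts)
    if p0 in (0.0, 1.0):
        raise ValueError("p0 method needs some, but not all, cultures at zero")
    m = -math.log(p0)
    return m / cells_per_culture


def fold_enhancement(test_frequencies, control_frequencies) -> float:
    """Ratio of median mutant frequencies (test versus vector control)."""
    return median(test_frequencies) / median(control_frequencies)


if __name__ == "__main__":
    # Hypothetical colony counts from ten parallel cultures of 1e8 cells.
    counts = [0, 0, 3, 1, 0, 0, 12, 0, 2, 0]
    print(f"p0 rate estimate: {p0_mutation_rate(counts, 1e8):.2e}")
    print(f"fold: {fold_enhancement([2e-7, 4e-7, 6e-7], [1e-8, 2e-8, 3e-8]):.0f}")
```

The median is used for the fold enhancement because mutant frequencies of independent cultures are typically skewed by occasional early mutation events, so that a mean would be dominated by a few high-count cultures.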

In a preferred embodiment of the invention, the method involves generating mutations in a target nucleic acid which encodes an immunoglobulin. Immunoglobulin loss may be detected both for cells which secrete immunoglobulins into the culture medium, and for cells in which the immunoglobulin is displayed on the cell surface. Where the immunoglobulin is present on the cell surface, its absence may be identified for individual cells, for example by FACS analysis, immunofluorescence microscopy or ligand immobilisation to a support. In a preferred embodiment, cells may be mixed with antigen-coated magnetic beads which, when sedimented, will remove from the cell suspension all cells having an immunoglobulin of the desired specificity displayed on the surface.

The technique may be extended to any immunoglobulin molecule, including antibodies, T-cell receptors and the like. The selection of immunoglobulin molecules will depend on the nature of the clonal population of cells which it is desired to assay according to the invention.

Alternatively, mutations in cells according to the invention may be identified by sequencing of target nucleic acids, such as V genes, and detection of mutations by sequence comparison. This process may be automated in order to increase throughput.
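An automated sequence comparison of the kind referred to above might, for example, take the form of the following minimal sketch. It assumes that the parental and progeny sequences are already aligned and of equal length (no insertions or deletions), and it groups substitutions into transitions at dC/dG, transitions at dA/dT and transversions. The function names and category labels are hypothetical.

```python
# Compare a pre-aligned progeny sequence with its parent and classify
# each single-base substitution. ASSUMPTION: equal length, no indels.

TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def find_substitutions(parent: str, progeny: str):
    """Return (position, parent_base, progeny_base) for each mismatch."""
    parent, progeny = parent.upper(), progeny.upper()
    if len(parent) != len(progeny):
        raise ValueError("sequences must be pre-aligned to equal length")
    return [(i, p, q)
            for i, (p, q) in enumerate(zip(parent, progeny))
            if p != q]


def classify(substitutions):
    """Count substitutions by type."""
    summary = {"transition_at_CG": 0, "transition_at_AT": 0,
               "transversion": 0}
    for _, p, q in substitutions:
        if (p, q) not in TRANSITIONS:
            summary["transversion"] += 1
        elif p in "CG":
            summary["transition_at_CG"] += 1
        else:
            summary["transition_at_AT"] += 1
    return summary


if __name__ == "__main__":
    subs = find_substitutions("ACGTCCGA", "ATGTCCAA")
    print(subs)
    print(classify(subs))
```

A summary of this form allows the proportion of transitions at dC/dG to be compared between populations, for example between mutator-expressing and control cells.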

In a further embodiment, cells which hypermutate V genes may be detected by assessing change in antigen binding activity in the immunoglobulins produced in a clonal cell population. For example, the quantity of antigen bound by a specific unit amount of cell medium or extract may be assessed in order to determine the proportion of immunoglobulin produced by the cell which retains a specified binding activity. As the V genes are mutated, so binding activity will be varied and the proportion of produced immunoglobulin which binds a specified antigen will be reduced.

Alternatively, cells may be assessed in a similar manner for the ability to develop a novel binding affinity, such as by exposing them to an antigen or mixture of antigens which are initially not bound and observing whether a binding affinity develops as the result of hypermutation.

In a further embodiment, the bacterial MutS assay may be used to detect sequence variation in target nucleic acids. The MutS protein binds to mismatches in nucleic acid hybrids. By creating heteroduplexes between parental nucleic acids and those of potentially mutated progeny, the extent of mismatch formation, and thus the extent of nucleic acid mutation, can be assessed.

Where the target nucleic acid encodes a gene product other than an immunoglobulin, selection may be performed by screening for loss or alteration of a function other than binding. For example, the loss or alteration of an enzymatic activity may be screened for.

Genetic Manipulation of Cells

Cells modified to express AID or its homologues are cells in which AID protein expression (or AID homologue protein expression) has been induced by means, for example, of transfecting host cells with a vector encoding AID protein. Such transfection may be stable or transient transfection.

“Vector” refers to any agent such as a plasmid, cosmid, virus, autonomously replicating sequence, phage, or linear single-stranded, circular single-stranded, linear double-stranded, or circular double-stranded DNA or RNA nucleotide sequence that carries exogenous DNA into a host cell or organism. The recombinant vector may be derived from any source. In the context of the present invention, the vector is for stable expression of AID and is, therefore, capable of genomic integration or autonomous replication but maintained throughout division cycles of the host cell.

An expression vector includes any vector capable of expressing a codingsequence encoding a desired gene product that is operatively linked withregulatory sequences, such as promoter regions, that are capable ofexpression of such DNAs. Thus, an expression vector refers to arecombinant DNA or RNA construct, such as a plasmid, a phage,recombinant virus or other vector, that upon introduction into anappropriate host cell, results in expression of the cloned DNA.Appropriate expression vectors are well known to those with ordinaryskill in the art and include those that are replicable in eukaryoticand/or prokaryotic cells and those that remain episomal or those whichintegrate into the host cell genome. For example, DNAs encoding aheterologous coding sequence may be inserted into a vector suitable forexpression of cDNAs in mammalian cells, e.g. a CMV enhancer-based vectorsuch as pEVRF (Matthias, et al., 1989).

Construction of vectors according to the invention employs conventionalligation techniques. Isolated plasmids or DNA fragments are cleaved,tailored, and religated in the form desired to generate the plasmidsrequired. If desired, analysis to confirm correct sequences in theconstructed plasmids is performed in a known fashion. Suitable methodsfor constructing expression vectors, preparing in vitro transcripts,introducing DNA into host cells, and performing analyses for assessinggene product expression and function are known to those skilled in theart. Gene presence, amplification and/or expression may be measured in asample directly, for example, by conventional Southern blotting,Northern blotting to quantitate the transcription of mRNA, dot blotting(DNA or RNA analysis), or in situ hybridisation, using an appropriatelylabelled probe which may be based on a sequence provided herein. Thoseskilled in the art will readily envisage how these methods may bemodified, if desired.

Vector-driven protein expression can be constitutive or inducible. Inducible vectors include either naturally inducible promoters, such as the trc promoter, which is regulated by the lac operon, the IPTG promoter which is inducible by IPTG and the pL promoter, which is regulated by tryptophan, the MMTV-LTR promoter, which is inducible by dexamethasone, or can contain synthetic promoters and/or additional elements that confer inducible control on adjacent promoters. Other promoters include E. coli promoters such as PhoB.

Methods for introducing the vectors and nucleic acids into host cells are well known in the art; the choice of technique will depend primarily upon the specific vector to be introduced and the host cell chosen. Plasmid vectors will typically be introduced into chemically competent or electrocompetent bacterial cells. Vectors can be introduced into yeast cells by spheroplasting, treatment with lithium salts, electroporation, or protoplast fusion. Mammalian and insect cells can be directly infected by packaged viral vectors, or transfected by chemical or electrical means.

Methods for Generating Fusion Proteins

AID or any of its homologues or derivatives, including Apobec-1, may be generated as fusion proteins comprising the AID protein or a portion that retains its mutator activity coupled to a DNA binding domain or one half of a specific binding pair. Preferably the fusion protein will not hinder the mutator activity of the protein sequence. Methods for generating fusion proteins will be familiar to those skilled in the art and include generation of expression vectors comprising the AID nucleic acid sequence linked or ligated to the nucleic acid sequence encoding a DNA binding domain.

Methods for Preparing and Selecting Immunoglobulins or Other Surface Expressed Proteins.

The process of hypermutation is employed, in nature, to generate improved or novel binding specificities in immunoglobulin molecules. Thus, by selecting cells according to the invention which produce immunoglobulins capable of binding to the desired antigen and then propagating these cells in order to allow the generation of further mutants, cells which express immunoglobulins having improved binding to the desired antigen may be isolated.

A variety of selection procedures may be applied for the isolation of mutants having a desired specificity. These include Fluorescence Activated Cell Sorting (FACS), cell separation using magnetic particles, antigen chromatography methods and other cell separation techniques such as use of polystyrene beads.

Separating cells using magnetic capture may be accomplished byconjugating the antigen of interest to magnetic particles or beads. Forexample, the antigen may be conjugated to superparamagnetic iron-dextranparticles or beads as supplied by Miltenyi Biotec GmbH. These conjugatedparticles or beads are then mixed with a cell population which mayexpress a diversity of surface immunoglobulins. If a particular cellexpresses an immunoglobulin capable of binding the antigen, it willbecome complexed with the magnetic beads by virtue of this interaction.A magnetic field is then applied to the suspension which immobilises themagnetic particles, and retains any cells which are associated with themvia the covalently linked antigen. Unbound cells which do not becomelinked to the beads are then washed away, leaving a population of cellswhich is isolated purely on its ability to bind the antigen of interest.Reagents and kits are available from various sources for performing suchone-step isolations, and include Dynal Beads (Dynal AS;http://www.dynal.no), MACS-Magnetic Cell Sorting (Miltenyi Biotec GmbH;http://www.miltenyibiotec.com), CliniMACS (AmCell;http://www.amcell.com) as well as Biomag, Amerlex-M beads and others.Similar techniques can be used for non-immunoglobulin surface expressedmolecules where selection for their surface expression can be throughrecognition by a specific binding partner.

Fluorescence Activated Cell Sorting (FACS) can be used to isolate cells on the basis of their differing surface molecules, for example surface displayed immunoglobulins. Cells in the sample or population to be sorted are stained with specific fluorescent reagents which bind to the cell surface molecules. These reagents would be the antigen(s) of interest linked (either directly or indirectly) to fluorescent markers such as fluorescein, Texas Red, malachite green, green fluorescent protein (GFP), or any other fluorophore known to those skilled in the art. The cell population is then introduced into the vibrating flow chamber of the FACS machine. The cell stream passing out of the chamber is encased in a sheath of buffer fluid such as PBS (Phosphate Buffered Saline). The stream is illuminated by laser light and each cell is measured for fluorescence, indicating binding of the fluorescent labelled antigen. The vibration in the cell stream causes it to break up into droplets, which carry a small electrical charge. These droplets can be steered by electric deflection plates under computer control to collect different cell populations according to their affinity for the fluorescent labelled antigen. In this manner, cell populations which exhibit different affinities for the antigen(s) of interest can be easily separated from those cells which do not bind the antigen. FACS machines and reagents for use in FACS are widely available from sources world-wide such as Becton-Dickinson, or from service providers such as Arizona Research Laboratories (http://www.arl.arizona.edu/facs/).

Another method which can be used to separate populations of cells according to the affinity of their cell surface protein(s) for a particular antigen is affinity chromatography. In this method, a suitable resin (for example CL-600 Sepharose, Pharmacia Inc.) is covalently linked to the appropriate antigen. This resin is packed into a column, and the mixed population of cells is passed over the column. After a suitable period of incubation (for example 20 minutes), unbound cells are washed away using (for example) PBS buffer. This leaves only that subset of cells expressing immunoglobulins which bound the antigen(s) of interest, and these cells are then eluted from the column using (for example) an excess of the antigen of interest, or by enzymatically or chemically cleaving the antigen from the resin. This may be done using a specific protease such as factor X, thrombin, or other specific protease known to those skilled in the art to cleave the antigen from the column via an appropriate cleavage site which has previously been incorporated into the antigen-resin complex. Alternatively, a non-specific protease, for example trypsin, may be employed to remove the antigen from the resin, thereby releasing that population of cells which exhibited affinity for the antigen of interest.

Insertion of Heterologous Transcription Units

In order to maximise the chances of quickly selecting an antibody variant capable of binding to any given antigen, or to exploit the AID-dependent hypermutation system for non-immunoglobulin genes, a number of techniques may be employed to engineer cells according to the invention such that their hypermutating abilities may be exploited.

In a first embodiment, transgenes are transfected into a cell according to the invention such that the transgenes become targets for the directed hypermutation events.

As used herein, a “transgene” is a nucleic acid molecule which is inserted into a cell, such as by transfection or transduction. For example, a “transgene” may comprise a heterologous transcription unit as referred to above, which may be inserted into the genome of a cell at a desired location. The “transgene” may be the nucleic acid encoding the gene product of interest.

The plasmids used for delivering the transgene to the cells are of conventional construction and comprise a coding sequence, encoding the desired gene product, under the control of a promoter. Gene transcription from vectors in cells according to the invention may be controlled by promoters derived from the genomes of viruses such as polyoma virus, adenovirus, fowlpox virus, bovine papilloma virus, avian sarcoma virus, cytomegalovirus (CMV), a retrovirus and Simian Virus 40 (SV40), from heterologous mammalian promoters such as the actin promoter or a very strong promoter, e.g. a ribosomal protein promoter, and from the promoter normally associated with the heterologous coding sequence, provided such promoters are compatible with the host system of the invention.

Transcription of a heterologous coding sequence by cells according to the invention may be increased by inserting an enhancer sequence into the vector. Enhancers are relatively orientation and position independent. Many enhancer sequences are known from mammalian genes (e.g. elastase and globin). However, typically one will employ an enhancer from a eukaryotic cell virus. Examples include the SV40 enhancer on the late side of the replication origin (bp 100-270) and the CMV early promoter enhancer. The enhancer may be spliced into the vector at a position 5′ or 3′ to the coding sequence, but is preferably located at a site 5′ from the promoter.

Advantageously, a eukaryotic expression vector may comprise a locus control region (LCR). LCRs are capable of directing high-level integration site independent expression of transgenes integrated into host cell chromatin, which is of importance especially where the heterologous coding sequence is to be expressed in the context of a permanently-transfected eukaryotic cell line in which chromosomal integration of the vector has occurred, in vectors designed for gene therapy applications or in transgenic animals.

Eukaryotic expression vectors will also contain sequences necessary for the termination of transcription and for stabilising the mRNA. Such sequences are commonly available from the 5′ and 3′ untranslated regions of eukaryotic or viral DNAs or cDNAs. These regions contain nucleotide segments transcribed as polyadenylated fragments in the untranslated portion of the mRNA.

Transgenes according to the invention may also comprise sequences which direct hypermutation. Such sequences have been characterised, and include those sequences set forth in Klix et al., (1998; Eur. J. Immunol. 28:317-326), and Sharpe et al., (1991; EMBO J. 10:2139-2145), incorporated herein by reference. Thus, an entire locus capable of expressing a gene product and directing hypermutation to the transcription unit encoding the gene product is transferred into the cells. The transcription unit and the sequences which direct hypermutation are thus exogenous to the cell. However, although exogenous, the sequences which direct hypermutation may themselves be similar or identical to the sequences which direct hypermutation naturally found in the cell.

The endogenous V gene(s) or segments thereof may be replaced with heterologous V gene(s) by homologous recombination, or by gene targeting using, for example, a Lox/Cre system or an analogous technology or by insertion into hypermutating cell lines which have spontaneously deleted endogenous V genes. Alternatively, V region gene(s) may be replaced by exploiting the observation that hypermutation is accompanied by double stranded breaks in the vicinity of rearranged V genes.

Furthermore, enhanced targeting of mutation can be achieved by inducing convergent promoters upstream and downstream of the desired gene and therefore inducing transcription in both directions. Deamination of dC in vitro by APOBEC-1 has been demonstrated to be dependent on the single-strandedness of the substrate oligonucleotide as described herein. The increase in availability of single-stranded DNA can be induced by convergent transcription or by a combination of transcription and DNA bending caused by promoter activation. Suitable types of promoter include the PhoB promoter. Other prokaryotic promoters include those controlled by activators (e.g. AraBAD, PhoA), those controlled by repressors (e.g. Tet, Lac, Trp, hybrid Lac/Trp such as Tac, pL, regulatable hybrids of pL such as pL-tet) and those recognised by a viral polymerase (e.g. T7). Suitable eukaryotic promoters include promoters recognised by RNA Polymerase I (e.g. 45S rDNA), RNA Polymerase II (e.g. Gal4, β-Actin, or viral, such as CMV-IE, and artificial, especially Tet-on, Tet-off) and RNA Polymerase III (e.g. H1 RNA, U6 snRNA).

DNA Binding Domain and Specific DNA Recognition Sequences

Transcription factors bind DNA by recognising specific target sequences generally located in enhancers, promoters, or other regulatory elements that affect a particular target gene. The target sequences for a number of transcription factors are well known to those skilled in the art. Transcription factors having specific DNA target or recognition sequences include the yeast transcription factors such as GAL4, bacterial proteins such as the repressor protein Lex A and mammalian transcription factors such as estrogen receptor.

The DNA binding domain within such proteins serves to bind the protein to the target sequence or “DNA binding protein recognition sequence” and therefore bring the protein to a set location within a DNA sequence.

One particular type of transcription factor binding site is named a “response element”, which is a particular DNA sequence which causes a gene to respond to a regulatory transcription factor. Examples include the heat shock response element (HRE) and the glucocorticoid response element (GRE). A number of hormone response elements are also known to those skilled in the art. Response elements contain short consensus sequences which are the target or recognition sequences for the DNA binding domains found within the corresponding inducible transcription factors such that, for example, transcription factors induced by a heat shock response bind HREs, glucocorticoid-induced factors bind GREs etc. Other examples include the binding of estrogen receptor via a DNA binding domain to the specific DNA binding protein recognition sequence called the ERD or estrogen response domain. The interaction of transcription factors and response elements is described, for example, in Genes VI, Lewin, Oxford University Press, 1997. Comparisons between the sequences of many transcription factors suggest that common types of motif can be found that are responsible for binding to DNA. Such motifs include the zinc finger motif, the helix-turn-helix or the helix-loop-helix. Other such motifs are known to the person skilled in the art.

The interaction between a DNA binding domain and a DNA binding protein recognition sequence can be used to direct mutation to a specific nucleic acid sequence. One way of directing mutation in this way is described as follows: an expression construct for expressing a fusion protein comprising Apobec with the estrogen receptor DNA binding domain (ERD) (Schwabe et al. Cell. 1993 Nov. 5; 75(3):567-78) is constructed as described below. The expression construct is expressed in yeast/E. coli using standard transfection procedures. The yeast/E. coli host cell is also engineered such that the desired target gene is linked to a short ERD recognition sequence (Schwabe et al., 1993).

Screening for Modulators of AID Activity

Compounds having inhibitory, activating, or modulating activity can be identified using in vitro assays for activity and/or expression of AID or its homologues including APOBEC1, APOBEC2, APOBEC3C and APOBEC3G, e.g., ligands, agonists, antagonists, and their homologs and mimetics.

Modulator screening may be performed by adding a putative modulator test compound to a cell expressing AID (or its homologues) in accordance with the invention, and monitoring the effect of the test compound on the function and/or expression of AID. A parallel sample which does not receive the test compound is also monitored as a control. The treated and untreated cells are then compared by any suitable phenotypic criteria, and in particular by comparing the mutator phenotype of the treated and untreated cells using methods as described herein.

The invention is further described below, for the purposes of illustration only, in the following examples.

EXAMPLES

Example 1

A plasmid containing a human AID cDNA expressed under control of the lac promoter was transformed into E. coli strain KL16 and its effect on the frequency of mutation to rifampicin-resistance (Rif^(R)) measured by fluctuation analysis (FIG. 2 and Table 1).

The AID expression plasmid was generated by cloning the human AID cDNA (Harris et al. (2002)) on an NcoI-HindIII fragment into pTrc99A (Pharmacia; gift of R. Savva). E. coli strains KL16 (Hfr (PO-45) relA1 spoT1 thi-1) and its ung-1 derivative (BW310) as well as AB1157 and its nfi-1::cat derivative (BW1161) were from B. Weiss; GM1003 (dcm-6 thr-1 hisG4 leuB6 rpsL ara-14 supE44 lacY1 tonA31 tsx-78 galK2 galE2 xyl-5 thi-1 mtl-1 mug::mini-Tn10) derivatives carrying ung-1 and/or mug::mini-Tn10 mutations were from A. Bhagwat.

APOBEC1 and APOBEC2 expression constructs were generated by subcloning the rat APOBEC1 cDNA (BamHI-SalI fragment of pSB202²⁷, gift from N. Navaratnam and J. Scott) or the human APOBEC2 cDNA (NcoI-BsaA1 fragment from IMAGE clone 341062). APOBEC3C was amplified from the Ramos human Burkitt lymphoma cell line cDNA using oligonucleotides 5′-NNNGAATTCAAGGCTGAACATGAATCCACAG (SEQ ID NO: 1) and 5′-NNNNNGTCGACGGAGACCCCTCACTGGAGA (SEQ ID NO: 2). APOBEC3G was amplified from IMAGE clone 1284557 using oligonucleotides 5′-NNNGAATTCAAGGATGAAGCCTCACTTCAGA (SEQ ID NO: 3) and 5′-NNGACTGCAGOCCATCCTTCAGTTTTCCTG (SEQ ID NO: 4).

The E63A, W90S and C93A substitutions in APOBEC1 were introduced by site-directed mutagenesis using the following oligonucleotide pairs:

5′-ACCAACAAACACGTTGcAGTCAATTTCATAGAAA (SEQ ID NO: 5)/TTTCTATGAAATTGACTgCAACGTGTTTGTTGGT (SEQ ID NO: 6),

5′-ACCTGGTTCCTGTCCTcGAGTCCCTGTGGGGAG (SEQ ID NO: 7)/CTCCCCACAGGGACTCgAGGACAGGAACCAGGT (SEQ ID NO: 8), and

CTGTCCTGGAGTCCCgcTGGGGAGTGCTCCAGG (SEQ ID NO: 9)/CCTGGAGCACTCCCCAgcGGGACTCCAGGACAG (SEQ ID NO: 10) (substitutions in lower case).
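Each mutagenic pair listed above should consist of two exactly complementary strands. The following is a minimal sketch of that check, not part of the described method: the sequences are SEQ ID NOs: 5 to 10 with the line-break spaces removed, and the assignment of each pair to E63A, W90S or C93A is an assumption based on the order in which the substitutions are listed. The helper names are hypothetical.

```python
# Complement table that preserves case, so the lower-case (substituted)
# bases remain marked after complementing.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, keeping letter case."""
    return seq.translate(_COMPLEMENT)[::-1]


# Pair labels are assumed from the listing order; sequences are SEQ ID NOs: 5-10.
PRIMER_PAIRS = {
    "E63A": ("ACCAACAAACACGTTGcAGTCAATTTCATAGAAA",
             "TTTCTATGAAATTGACTgCAACGTGTTTGTTGGT"),
    "W90S": ("ACCTGGTTCCTGTCCTcGAGTCCCTGTGGGGAG",
             "CTCCCCACAGGGACTCgAGGACAGGAACCAGGT"),
    "C93A": ("CTGTCCTGGAGTCCCgcTGGGGAGTGCTCCAGG",
             "CCTGGAGCACTCCCCAgcGGGACTCCAGGACAG"),
}


def pairs_are_complementary(pairs):
    """Map each pair name to True if the second oligo is the reverse
    complement of the first (including the lower-case substituted bases)."""
    return {name: reverse_complement(fwd) == rev
            for name, (fwd, rev) in pairs.items()}


if __name__ == "__main__":
    for name, ok in pairs_are_complementary(PRIMER_PAIRS).items():
        print(name, "complementary" if ok else "MISMATCH")
```

A mismatch reported by such a check would point to a transcription error in one strand rather than to a problem with the mutagenesis design itself.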

All constructs were verified by DNA sequencing and were identical to published sequences (Madsen et al. (1999), J. Invest. Dermatol. 113, 162-169; Jarmuz et al.; Anant et al. (2001), Am. J. Physiol. Cell Physiol., 281, C1904-1916) or to existing GenBank entries (APOBEC1: NM_012907.1; APOBEC2: NM_006789.1).

The plasmids were transformed into E. coli strain KL16 and their effect on the frequency of mutation to rifampicin-resistance (Rif^(R)) measured by fluctuation analysis (Tables 2 and 3).

Mutation Assays

Mutation frequencies were measured by determining the median number of colony-forming cells surviving selection per 10⁹ viable cells plated. Each median was determined from 8-16 independent cultures grown overnight to saturation in rich medium supplemented with 100 μg/ml carbenicillin and 1 mM IPTG (unless indicated otherwise). Rif^(R) and Nal^(R) colonies were selected on rich medium containing 100 μg/ml rifampicin and 40 μg/ml nalidixic acid respectively. Valine- and fucose-resistant mutants were selected on minimal M9 medium containing 0.2% glucose/40 μg/ml L-Valine and 0.1% L-arabinose/0.2% D-fucose respectively.
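The frequency calculation described above can be sketched as follows. This is an illustrative sketch only: the function names and all colony counts are hypothetical and are not data from the experiments, and each culture is assumed to have its own viable-cell count.

```python
from statistics import median


def mutation_frequency(resistant_colonies, viable_cells_plated, per=1e9):
    """Median number of resistant colonies per `per` viable cells plated,
    taken across independent cultures (one entry per culture)."""
    if len(resistant_colonies) != len(viable_cells_plated):
        raise ValueError("one viable-cell count is needed per culture")
    per_culture = [r * per / v
                   for r, v in zip(resistant_colonies, viable_cells_plated)]
    return median(per_culture)


def fold_enhancement(test_frequency, control_frequency):
    """Ratio of a test mutation frequency to the vector-control frequency."""
    return test_frequency / control_frequency


# Hypothetical counts for eight independent cultures each (illustration only).
vector = mutation_frequency([2, 5, 3, 4, 6, 1, 4, 3], [1e9] * 8)
aid = mutation_frequency([20, 31, 18, 25, 40, 22, 27, 24], [1e9] * 8)

if __name__ == "__main__":
    print("vector:", vector, "AID:", aid,
          "fold:", fold_enhancement(aid, vector))
```

The median is used rather than the mean because a mutation arising early in one culture can produce a very large number of resistant colonies and distort an average.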

In multiple experiments performed in the presence of the transcriptional inducer IPTG, the AID-transformed cells generated Rif^(R) colonies at a frequency some 4-8 fold higher than vector-transformed controls. This stimulation was evident in different genetic backgrounds (KL16, GM1003 and AB1157), was dependent upon AID (monitored ± IPTG) and was not peculiar to the selection applied, being also clear when mutation to nalidixic acid (Nal)-, valine- or fucose-resistance was monitored (FIG. 2 and Table 1). The variation in the mutation enhancement observed in the different selections could reflect differences in the types and abundances of mutations that confer resistance.

In similar multiple experiments, cells transformed with Apobec-1 generated Rif^(R) colonies at a much higher frequency (several hundred fold) than vector-transformed controls. Cells transformed with other Apobec homologues, Apobec3C and Apobec3G, also showed an increased frequency (10-20 fold) of mutation to Rif^(R) compared to the vector-transformed controls.

For the experiments shown in FIG. 5 and Table 3, all measurements were performed using KL16 or its ung-1 derivative BW310 transformed with vector alone or an expression construct as indicated. Mutation frequencies were measured by determining the median number of colony-forming cells surviving selection per 10⁹ viable cells plated. Each median presented in FIG. 5 and Table 3 was determined from 12-16 independent cultures grown overnight to saturation in rich medium supplemented with 100 μg/ml carbenicillin and 1 mM IPTG (with the exception of control experiments in which the inducer IPTG was omitted, FIG. 5 b). IPTG-induced expression of APOBEC1 or its homologues conferred no obvious defect in cell growth or viability (e.g. APOBEC1, FIG. 5 c). Rif^(R) mutants were selected on rich medium containing 100 μg/ml rifampicin and sequenced. Only about 1% of the Rif^(R) colonies failed to contain mutations in the region of rpoB sequenced [nucleotides 1525-1722, numbering from the initiating ATG; GenBank AE000472].

The nature of the Rif^(R) and Nal^(R) mutants was determined by directly amplifying and sequencing the relevant section of the rpoB [627 bp PCR product amplified using 5′-TTGGCGAAATGGCGGAAAACC (SEQ ID NO: 11) and 5′-CACCGACGGATACCACCTGCTG (SEQ ID NO: 12)] or gyrA [521 bp PCR product amplified using oligonucleotides 5′-GCGCGGCTGTGTTATAATTT (SEQ ID NO: 13) and 5′-TTCCGTGCCGTCATAGTTATC (SEQ ID NO: 14)].

If the AID-mediated enhancement in mutation frequency is due to a stimulation of dC deamination, the pattern of mutation to Rif^(R) should show a shift toward dC→dT and dG→dA transitions. Sequencing of the rpoB gene in multiple independent Rif^(R) colonies revealed that this is indeed the case. Such transitions account for 79% of the mutations scored in the AID-transformed cells but only for 31% of the mutations in the vector-transformed controls (FIG. 3 a, b). Given the extent of mutation stimulation by AID, the data are consistent with the entire AID-mediated enhancement being due to transitions at dG/dC. A similar conclusion was obtained by examining the spectrum of gyrA mutations in the Nal^(R) colonies, despite the fact that the selected mutations appeared restricted to essentially three nucleotide positions. Thus, whereas 34% of the gyrA mutations amongst the control transformants are nucleotide transitions at dG/dC, the percentage increases to 71% in the AID transformants (FIG. 4).
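The scoring used above, in which each sequenced substitution is classed as a transition at dG/dC or not, can be sketched as below. The function names and the example list of substitutions are hypothetical; they illustrate the classification only and do not reproduce the sequencing data.

```python
# A dC deamination event is read out as dC->dT on the deaminated strand,
# or as dG->dA when the event occurred on the complementary strand.
TRANSITIONS_AT_GC = {("C", "T"), ("G", "A")}


def is_gc_transition(ref: str, alt: str) -> bool:
    """True if the substitution ref->alt is a dC->dT or dG->dA transition."""
    return (ref.upper(), alt.upper()) in TRANSITIONS_AT_GC


def gc_transition_fraction(substitutions):
    """Fraction of (ref, alt) substitutions that are transitions at dG/dC."""
    if not substitutions:
        raise ValueError("no substitutions to score")
    hits = sum(is_gc_transition(ref, alt) for ref, alt in substitutions)
    return hits / len(substitutions)


if __name__ == "__main__":
    # Hypothetical substitutions, one per sequenced colony (illustration only).
    example = [("C", "T"), ("G", "A"), ("A", "G"), ("C", "A")]
    print(gc_transition_fraction(example))
```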

It is notable that there is a striking difference in mutation distribution between the AID transformants and controls. Analysis of the rpoB mutations amongst the vector-transformed control cells reveals that dC→dT transitions at positions Ser512, Ser522, His526, Ser531, Pro564 and Ser574 can all confer Rif^(R). However, transitions at only some of these positions (His526, Ser531 and Ser574) are enhanced amongst the AID transformants whereas other positions (Ser522 and Pro564) show little sign of increased mutation. Even more striking is the fact that a common dG→dA transition in the AID transformants (Arg529) is not seen at all in the controls (FIG. 3). Similar evidence of specific targeting comes from gyrA. Whereas dC→dT transitions at Ser83 and dG→dA transitions at Asp87 can both confer Nal^(R), it is the dC→dT transitions at Ser83 that are selectively enhanced by AID (FIG. 4). Despite this strong evidence that AID-dependent mutation is non-random, presumably depending upon local sequence environment, we cannot discriminate on these datasets whether this sequence preference reflects a hotspot preference similar to that of the dG/dC-biased phase of antibody hypermutation (Rada et al. (1998)) since there are only a limited number of base substitutions that can yield the selected phenotypes.

If AID-induced mutations in the E. coli transformants are indeed occurring through deamination of dC, an enhancement of the effect would be expected in cells lacking uracil-DNA glycosylase (UDG). This is indeed the case. Although both UDG deficiency and ectopic expression of AID are sufficient in themselves to yield a mutator phenotype, AID expression in an ung background yielded a mutation frequency that was much greater than the sum of their independent mutation frequencies (FIG. 2 and Table 1). A similar effect was seen in E. coli expressing Apobec-1 (see Table 2).

In E. coli ung-1 mutants, some back-up uracil DNA-glycosylase activity may be provided by the product of the mug gene (Sung et al. (2001) and Mokkapati et al. (2001)). It is found that whilst the AID mutator effect is not significantly higher in a mug⁻ than mug⁺ background, the mug mutation allows at most a slightly augmented AID-mutator effect when combined with ung-1 (Table 1). If AID were to act by deaminating dG rather than dC, an increased mutation frequency in a background deficient in endonuclease V (encoded by nfi) might be anticipated since this enzyme is implicated in the repair of deoxyxanthosine^(21,22). This does not occur; the mutation frequency displayed by AID-transformed nfi-1 cells approximates the sum of the frequencies that are independently attributable to AID and nfi-1 (Table 1).

The data strongly suggest that AID mediates the deamination of dC residues in the DNA. The homology of AID to Apobec-1 and cytidine deaminases (Muramatsu et al. (1999)) obviously argues in favour of a close involvement of AID in the DNA deamination process itself. The preferential targeting of mutation to the immunoglobulin loci in lymphocytes presumably depends on proteins with which AID associates. Given that the cis-regulation of both switch recombination and hypermutation is linked to the transcription regulatory elements (Manis et al. (2002); Betz et al. (1994)), it would appear likely that AID is recruited either directly or indirectly by transcription- or chromatin-associated factors.

APOBEC1-transformed bacteria grown in the presence of the transcriptional inducer IPTG displayed massively elevated frequencies of Rif^(R) mutation (FIG. 5 a). This enhancement was confirmed by fluctuation analyses of Rif^(R) mutants observed in three independent experiments (Table 3 top). In comparison to vector-transformed cells, the median enhancement by APOBEC1 ranged from 440- to 700-fold (mean of 530), whereas that attributable to AID ranged from 3.8- to 13-fold (mean of 7.8, in agreement with data above). The observed increases were due to APOBEC1 since experiments performed in the absence of the inducer IPTG resulted in a significantly diminished effect (FIG. 5 b). Furthermore, single amino acid changes E63A, W90S and C93A (which are located at or close to the proposed Zn²⁺-coordination domain at the active site of APOBEC1 (Navaratnam et al.)) abolished the enhancement (FIG. 5 c). The stimulation was not specific to the selection or the locus (Rif^(R) mutations map largely to the rpoB gene) since it was also clear when resistance to nalidixic acid was selected (due mostly to mutations in the gyrA gene) (Table 3 (top)). Mutation frequencies at gyrA were significantly lower than at rpoB; this likely reflects restrictions in numbers of base substitutions that each locus permits (both genes are essential and fewer sites appear mutable in gyrA). It is notable that whilst APOBEC1 yields a mutator phenotype in these assays as strong as that achieved with some of the most potent E. coli mutators (e.g. mismatch repair-defective strains (Schaaper (1993), J. Biol. Chem. 268, 23762-23765)), the increased mutation load due to APOBEC1 expression caused no obvious defects in cell growth or viability (FIG. 5 d). This might reflect the nature of the lesions introduced by APOBEC1 expression.

If, as with AID, the observed stimulation of mutation is due to increased dC deamination, then this should be apparent in the spectrum of Rif^(R) mutations: a bias toward dC/dG→dT/dA transition mutations would result. This was confirmed by sequencing rpoB gene PCR products from purified Rif^(R) colonies selected from APOBEC1-transformed cultures (as well as from AID-transformed and vector-transformed controls). By comparison with vector-transformed controls, APOBEC1-transformed cells showed a dramatic shift in mutation spectrum, from 27% (32/120 mutations) to 100% (136/136 mutations) transition mutations at dC/dG (FIG. 2). Consistent with the results presented above, AID-transformed E. coli gave a somewhat less dramatic shift to 82% (102/124) transitions at dC/dG (FIG. 7), reflecting the fact that AID, at least in this system, is a less potent mutator than APOBEC1.

The mutation spectra revealed striking local differences between vector-, APOBEC1- and AID-transformed cells with respect to the specific dC/dG pairs targeted. Whereas, in keeping with AID, the majority of dC/dG to dT/dA transitions in rpoB in AID-transformed cells clustered at C1576 (45/124 mutations) and G1586 (23/124 mutations), those in APOBEC1-transformed cells showed a quite distinct distribution with major hotspots at C1535 (39/136 mutations) and C1592 (74/136 mutations) (FIG. 6 a and FIG. 7). Thus, the entire enhancement of Rif^(R) mutation frequency observed in APOBEC1-transformed cells occurs via transitions at dC/dG base pairs but with the local targeting specificity being remarkably different from that of AID.

The different local targeting specificities of APOBEC1 and AID strongly argue that both proteins are involved in a dC deamination process, generating dU/dG lesions in DNA. Given this likely mode of action, one would expect that the stimulation of mutation by APOBEC1 (like that by AID) would be enhanced in cells lacking uracil-DNA glycosylase (UDG), an enzyme that specifically recognises dU in DNA and initiates base excision repair of dU/dG lesions (Lindahl). UDG-deficiency (ung-1) and APOBEC1 expression by themselves enhance mutation about 10- and 500-fold, respectively. APOBEC1 expression in UDG-deficient cells further increases levels of mutation to about 2600-fold above vector-transformed ung⁺ cells, a much more than additive effect demonstrating that APOBEC1 is capable of triggering dU/dG lesions (Table 3 top). Despite the additional mutation load in ung-1 cells, sequence analysis of the mutations conferring Rif^(R) revealed that, as for AID, the mutational targeting by APOBEC1 in an ung-1 background was essentially the same as in ung⁺ cells (data not shown).
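The "much more than additive" comparison can be made explicit with a short calculation. The sketch below uses only the approximate fold values quoted above (10, 500 and 2600); the helper name is hypothetical, and the additive model (each factor contributing its own excess over the vector baseline independently) is an assumption made for illustration.

```python
def additive_expectation(fold_a: float, fold_b: float) -> float:
    """Fold enhancement expected if two mutator effects simply add.

    With the vector baseline set to 1, factor A contributes (fold_a - 1)
    extra mutations and factor B contributes (fold_b - 1), so the additive
    total is 1 + (fold_a - 1) + (fold_b - 1).
    """
    return fold_a + fold_b - 1


# Approximate fold enhancements quoted in the text.
UNG1_FOLD = 10
APOBEC1_FOLD = 500
OBSERVED_COMBINED_FOLD = 2600

expected = additive_expectation(UNG1_FOLD, APOBEC1_FOLD)
excess = OBSERVED_COMBINED_FOLD / expected

if __name__ == "__main__":
    print(f"additive expectation: {expected}-fold")
    print(f"observed / additive: {excess:.1f}")
```

On these figures the observed combined effect is roughly five times the additive expectation, which is the sense in which removal of UDG and expression of APOBEC1 act synergistically.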

At least six other APOBEC1-like proteins exist in humans (Madsen et al.; Jarmuz et al.; Anant et al.). APOBEC2 (also called APOBEC1-related cytidine deaminase-1, ARCD-1) is found on chromosome 6p21.1 and the others, termed APOBEC3A through APOBEC3G (also termed phorbolins or ARCDs), are encoded on chromosome 22q12-q13. They all contain a region homologous to the putative Zn²⁺-binding cytidine deaminase motif of APOBEC1. This suggested that the mutator activities of these proteins might also be conserved and prompted us to ask whether these homologues might also work on DNA. Expression of APOBEC3C and APOBEC3G, representative members of the chromosome 22 cluster (FIG. 7 a), but not APOBEC2, triggered increases in the frequencies of mutation to Rif^(R) and Nal^(R) in E. coli (Table 3 bottom). The stimulation of mutation by APOBEC3G is significantly greater when monitored by the frequency of Rif^(R) rather than Nal^(R) clones (Table 3). This may reflect the relatively strong target preference of APOBEC3G (see below) taken together with the fact that there are many more dC/dG targets in rpoB than in gyrA that can confer resistance to the relevant antibiotic. The mutation frequencies to Rif^(R) achieved with both APOBEC3C and APOBEC3G were also further elevated in an ung-1 background, indicating that they, like APOBEC1 and AID, potentiate dU/dG mispairs, substrates for UDG and subsequent repair (Lindahl). In contrast, cells transformed with a human APOBEC2 expression construct showed neither increased mutation frequencies (ung⁺ or ung⁻ backgrounds; Table 3 bottom) nor a significantly altered rpoB mutation spectrum (data not shown).

That APOBEC3C and APOBEC3G also act like APOBEC1 and AID is supported further by a near complete shift in the spectrum of mutations that yield Rif^(R), from 27% (32/120) dC/dG→dT/dA transitions in vector-transformed cells to 94% (102/108) and 88% (81/92) in APOBEC3C- and APOBEC3G-transformed cells respectively (FIG. 7). Moreover, a direct comparison of the dC/dG base pairs targeted by APOBEC3C and APOBEC3G with those targeted by APOBEC1 and AID revealed obvious biases, the most striking of which was mutation of C1691 by APOBEC3G (71/92 mutations compared to 4/120 for vector-transformed cells). APOBEC3C, on the other hand, shared one hotspot with APOBEC1 (44/108 at C1535) and another with AID (23/108 at C1576), and appeared to be slightly more promiscuous, causing dC/dG→dT/dA transition mutations at eight positions in rpoB (FIG. 7 b).
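The percentages quoted for the rpoB mutation spectra follow directly from the mutation counts given in the text. The short sketch below recomputes them as a consistency check; the counts are those stated above, while the dictionary and function names are hypothetical.

```python
# (dC/dG -> dT/dA transitions, total Rif^R mutations sequenced), as quoted
# in the text for each transformant.
TRANSITION_COUNTS = {
    "vector": (32, 120),
    "APOBEC1": (136, 136),
    "AID": (102, 124),
    "APOBEC3C": (102, 108),
    "APOBEC3G": (81, 92),
}


def percent_transitions(transitions: int, total: int) -> int:
    """Percentage of mutations that are dC/dG transitions, to the nearest 1%."""
    return round(100 * transitions / total)


PERCENTAGES = {name: percent_transitions(k, n)
               for name, (k, n) in TRANSITION_COUNTS.items()}

if __name__ == "__main__":
    for name, pct in PERCENTAGES.items():
        k, n = TRANSITION_COUNTS[name]
        print(f"{name}: {k}/{n} = {pct}%")
```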

Example 2

APOBEC1 Expressed in E. coli can be Used to Mutate a Heterologous Gene Integrated into the Chromosome.

The Bacillus subtilis gene SacB is toxic to E. coli in the presence of sucrose. SacB is cloned under the control of the E. coli promoter for PhoB and the cassette integrated into the chromosome of E. coli strain DH10b at the Lambda phage attachment site using pRB700 (FIG. 8 a), a derivative of the CRIM system plasmid pSK50A-uidA2 (Haldimann et al. 1996 Proc Natl Acad Sci USA. 93(25):14361-6; Haldimann and Wanner 2001 J. Bacteriol. 183(21):6384-93). The PhoB promoter is active under conditions of low inorganic phosphate availability. Thus mutants in the SacB cassette can be selected by growing independent colonies transfected with either an APOBEC-1 expression construct or a control plasmid to saturation (using one fortieth of the colony as the inoculum) and plating on minimal MOPS medium containing 5% sucrose and limiting phosphate.

PCR and subsequent sequencing of the integrated SacB genes in these sucrose resistant colonies demonstrates that spontaneous mutants at this locus arise primarily by transposon insertion (and therefore generate a significantly larger than expected PCR product). This accounted for 13/16 spontaneous mutations. In contrast, point mutations predominate when APOBEC-1 is expressed. Furthermore, these point mutations are overwhelmingly (32/33) transitions at C and G, consistent with these mutations arising by deamination of dC as expected (FIG. 9).

Enhanced Targeting of Mutation can be Achieved by Inducing Convergent Promoters Upstream and Downstream of the Desired Gene.

The dependence of mutation caused by APOBEC-1 at this locus on transcription is investigated. APOBEC-1 increases the mutation frequency at SacB approximately 12-fold when colonies are grown in rich medium, and growth in medium containing limiting phosphate does not appear to enhance mutation at this locus. To investigate the possibility that transcription in both directions might be required to show an increase in mutation frequency, a variant SacB cassette in pRB740 is created, under the control of the same PhoB promoter, additionally placing the strong IPTG inducible Trc promoter downstream of the SacB gene in the opposite orientation (FIG. 8 b).

The mutation frequency of SacB following growth in rich medium with IPTG is in this case comparable to that of the original cassette without the convergently orientated Trc promoter, and so is the spectrum of point mutations obtained (20/20 are transitions at C or G), indicating that this variant SacB cassette does not mutate with an appreciably higher frequency when transcription is induced only in the antisense direction.

However, following growth in limiting phosphate together with IPTG, the mutation frequency is enhanced approximately 1000 fold above that achieved either by APOBEC-1 without the downstream promoter under the same conditions, or with the downstream promoter in rich medium (FIG. 10 a, b).

Thus, activation of convergent promoters located on opposite sides of the gene is able to enhance APOBEC-1 induced mutation at that locus very appreciably. Furthermore, expression of the less mutagenic APOBEC family members AID and APOBEC-3G under these conditions of convergent transcription also gives rise to a significant increase in mutation frequency above background and a shift towards the expected PCR product size, indicating that transposon insertions are responsible for a lower proportion of the observed mutants. The expected PCR size product is obtained in 10/10 and 7/10 cases for AID and APOBEC3G respectively, compared to only 2/10 and 5/10 respectively under non-transcribing conditions (FIG. 11).

Under conditions of bi-directional transcription, transitions at C or G account for 8/8 and 5/5 of the point mutations observed for AID and APOBEC3G respectively (FIG. 11). Taken together, these results demonstrate that targeted deamination by members of the APOBEC family can be achieved if the desired gene to be targeted is placed between convergent promoters (FIG. 11).

Example 3 Deamination of Cytosine to Uracil in DNA can be Achieved In Vitro Using Partially Purified APOBEC1 from Extracts of Transformed Escherichia coli

Plasmids and Bacteria

The pTrc99- and pET-based expression vectors for rat APOBEC1 and its E63->A mutant, for human APOBEC2 and for E. coli dCTP deaminase, as well as the E. coli host strains, have been described previously (Rada et al. (2002) Curr. Biology 12, 1748-1755, Randerath K., and Randerath E. (1967) Method Enzymol. 12, 323-347). The pTrc99- and pET-based vectors differ both in the nature of the promoter used (pTrc99 uses the trp/lac hybrid promoter whereas pET uses the T7 promoter) and in the length of heterologous peptide linked to the amino-terminus of the recombinant protein (9 amino acids with pTrc99 but 34 amino acids with pET (Rada et al. (2002) Curr. Biology 12, 1748-1755, Randerath K., and Randerath E. (1967) Method Enzymol. 12, 323-347)).

Oligodeoxyribonucleotides

The oligodeoxyribonucleotides used are listed in Table 4.

Preparation of Recombinant APOBEC1

A 2 ml overnight culture of a fresh E. coli transformant grown in LB, 0.2% glucose, 50 μg/ml carbenicillin was diluted into 300 ml of the same medium and grown at 37° C. to an A₆₀₀ of 0.8. The culture was chilled on ice for 20 min and then incubated with aeration for 16 h at 16° C. in the presence of inducer (1 mM IPTG). Cells were harvested by centrifugation, washed and resuspended in 20 ml buffer H (50 mM Tris.HCl, pH 7.4, 50 mM KCl, 5 mM EDTA, 1 mM DTT and a protease inhibitor cocktail [Roche]).

Following sonication and ultracentrifugation (100,000 g for 45 min), the supernatant was passed through a 0.2 μm filter and applied to a Sepharose Fast-Flow Mono-Q column (Amersham Biosciences; 10 ml bed volume). After washing with seven column volumes of buffer H, bound proteins were eluted in buffer H supplemented with increasing salt concentrations (from 50 to 1500 mM Cl), collecting 15 ml fractions. Fractions and flow-through were concentrated one-hundred fold using VivaSpin concentrators (M_(r) 10,000 cut-off) (VivaScience) and assayed. Samples eluting with 1000-1500 mM salt were pooled and loaded in a volume of 0.5 ml onto a HighPrep Sephacryl S-200 High-Resolution 16/60 gel-filtration column (Amersham Biosciences) in buffer H. Fractions (1 ml) were collected and concentrated twenty fold before analysis.
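The volumes implied by the concentration and washing steps above can be checked with a short calculation. The following Python sketch is illustrative only; the function name is hypothetical and the only inputs are the figures stated in the protocol (15 ml ion-exchange fractions concentrated 100-fold, 1 ml gel-filtration fractions concentrated 20-fold, seven column volumes of a 10 ml bed).

```python
def concentrated_volume_ml(start_volume_ml: float, fold: float) -> float:
    """Volume remaining after concentrating a sample `fold`-fold."""
    if fold <= 0:
        raise ValueError("fold must be positive")
    return start_volume_ml / fold


# Ion-exchange fractions: 15 ml concentrated one-hundred fold.
ion_exchange_fraction_ml = concentrated_volume_ml(15.0, 100)

# Gel-filtration fractions: 1 ml concentrated twenty fold.
gel_filtration_fraction_ml = concentrated_volume_ml(1.0, 20)

# Wash step: seven column volumes of a 10 ml bed.
wash_volume_ml = 7 * 10.0

if __name__ == "__main__":
    print(f"ion-exchange fraction after concentration: {ion_exchange_fraction_ml:.2f} ml")
    print(f"gel-filtration fraction after concentration: {gel_filtration_fraction_ml:.2f} ml")
    print(f"wash volume: {wash_volume_ml:.0f} ml")
```

On these figures a concentrated ion-exchange fraction is roughly 150 μl and a concentrated gel-filtration fraction roughly 50 μl, which is consistent with the 1-4 μl sample volumes used in the assays.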

TLC-Based Deaminase Assay

Samples (2-4 μl) were incubated at 37° C. for 5 h in 20 μl of buffer R (40 mM Tris pH 8, 40 mM KCl, 50 mM NaCl, 5 mM EDTA, 1 mM DTT, 10% glycerol) containing 75,000 cpm of α-[³²P]dC-labelled single-stranded DNA (prepared by a 3 min heating to 95° C. of the products of asymmetric PCR amplification of the lacI region in pTrc99 performed using α-[³²P]dCTP (3000 Ci/mmol)). Following phenol extraction and ethanol precipitation, the DNA was digested with Penicillium citrinum P1 nuclease (Sigma) overnight at 37° C. (Grunau C., Clark S. J., and Rosenthal A. (2001) Nucleic Acids Res. 29, E65) and the P1 digests were then subjected to thin layer chromatography on PEI-cellulose in either (i) 0.5 M LiCl at 4° C. or (ii) at room temperature in 1 M CH₃COOH until the buffer front had migrated 2.5 cm and then in 0.9 M CH₃COOH:0.3 M LiCl (Cohen, R. M., and Wolfenden, R. (1971) J. Biol. Chem. 246, 7561-7565). Products were detected using a phosphorimager. Chemical deamination of cytosine in DNA using bisulfite/hydroquinone was performed as described (Yamanaka et al. (1995) Proc. Nat. Acad. Sci. USA, 92, 8483-8487).

UDG-Based Deaminase Assay

Samples (1-2 μl) were incubated at 37° C. for 2 h in 10 μl of buffer R with 5′-biotinylated oligonucleotides that either were synthesized with fluorescein at their 3′-ends (3 pmol of oligonucleotide per reaction) or were 3′-labelled by ligation with α-[³²P]dideoxyadenylate (100,000 cpm; 0.1 pmol) using terminal deoxynucleotidyl transferase.

Reactions were terminated by heating to 90° C. for 3 min and oligonucleotides purified on streptavidin magnetic beads (Dynal), washing at 72° C. (except in FIG. 2A, where the streptavidin purification step was omitted). Deamination of cytosine in the oligonucleotides was monitored by incubating the bead-immobilised oligonucleotides at 37° C. for 30 min with excess uracil-DNA glycosylase (0.5 units UDG; enzyme and buffer from NEB) and then bringing the sample to 0.15 M in NaOH and incubating for a further 30 min. The oligonucleotides were then subjected to electrophoresis on 15% PAGE-urea gels which were developed by either fluorescence detection or phosphorimager analysis.

Western Blotting

Western blot detection of APOBEC1 following SDS/PAGE of samples that had been diluted 20-100 fold was performed using a goat-anti-APOBEC1 serum (Santa Cruz Biotechnology), developing with horseradish peroxidase-conjugated donkey anti-goat immunoglobulin antiserum (Binding Site, Birmingham, UK). Low-range molecular weight markers were from BioRad.

Results

DNA Deamination Assay in Cell Extracts

Since, of all the APOBEC family members tested, APOBEC1 displayed the most potent mutator activity in the E. coli mutation assay (Randerath K. and Randerath E. (1967) Method Enzymol 12, 323-347), APOBEC1-transformed E. coli were investigated to see whether DNA deamination activity could be detected in vitro using cell extracts.

Initially, the UDG-based deaminase assay was tried, working with an oligodeoxyribonucleotide substrate. However, no evidence of deamination was obtained using double-stranded oligonucleotide substrates whereas single-stranded oligonucleotides were rapidly degraded by both APOBEC1 and control extracts (data not shown). The possibility that the DNA deaminating activity might be specific for single-stranded substrates but that this activity might be masked by non-specific nucleases was investigated. An assay that would be less sensitive to contaminating nucleases (FIG. 12A) was devised.

The bacterial extracts were incubated with α-[³²P]dC-labelled single-stranded DNA which was then purified, digested with nuclease P1 and subjected to thin-layer chromatography to test for the presence of α-[³²P]dUMP. Clear evidence of dC deamination in this assay was detected using extracts of E. coli expressing two different APOBEC1 constructs but not from control extracts or from extracts made from E. coli cells carrying plasmids expressing mutant APOBEC1, APOBEC2 or dCTP deaminase (none of which function as DNA mutators in the bacterial assay (Randerath K. and Randerath E. (1967) Method Enzymol 12, 323-347)) (FIG. 12B). The DNA deaminase activity was evident in APOBEC1-transformants of a mutant E. coli deficient in both dcd- and cdd-encoded deaminases (FIG. 12B (iii)). That the product of APOBEC1 action was indeed dUMP is indicated by the co-migration of the radioactive product with dUMP in two distinct buffer systems.

These results suggested fractionation of the extracts of APOBEC1-transformed E. coli to see if the DNA deamination activity could be sufficiently separated from non-specific nucleases so as to be detectable using the oligonucleotide cleavage assay.

Partial Purification

Pilot experiments revealed that ion-exchange chromatography could be used to obtain samples of APOBEC1 that contained diminished non-specific nuclease activity. Thus, whilst only a proportion of the APOBEC1 polypeptide bound to the Mono-Q column (around 10-20% based on ECL quantitation of the Western blot assay), elution of this bound fraction with >0.8 M Cl yielded a sample that displayed cytosine-DNA deamination activity (as monitored using the TLC-based assay) but contained diminished non-specific nuclease activity in the UDG-based assay (FIG. 13A). These fractions were then concentrated and subjected to gel filtration (FIG. 13B). The major APOBEC1 peak eluted in fractions 7-9 (corresponding to an M_(r) of 95-140,000), co-eluting with peak DNA deaminating activity. Indeed, with these fractions from the gel filtration column, DNA deamination could now readily be detected by the UDG-based assay using a single-stranded oligonucleotide substrate (although the peak fractions also contained activity that removed the 3′-label from the oligonucleotide). Mass spectrometric analysis of proteins in fraction 9 following SDS/PAGE revealed the recombinant APOBEC1 migrating at the position marked by the asterisk in FIG. 13B(i), although the majority of the bands derived from ribosomal proteins.

Characteristics of the DNA Deaminating Activity

The UDG-based deaminase assay was used to monitor the specificity and characteristics of the partially purified APOBEC1 (FIG. 14A). Samples were incubated with a single-stranded oligodeoxyribonucleotide (with or without its complement) which contained internal dC residue(s) and that was 5′-biotinylated as well as 3′-labelled. After purification on streptavidin, the oligonucleotide was treated with UDG (plus alkali), resulting in site-specific cleavage if the oligonucleotide had been subjected to dC->dU deamination. Thus, deamination is read out by the appearance of the specific cleavage product following PAGE-urea analysis.
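The read-out described above can be sketched as a simple calculation of the expected size of the 3′-labelled cleavage product for each dC that might be deaminated. The Python sketch below is illustrative only: the function name and the example oligonucleotide are hypothetical (the sequences actually used are those of Table 4), and it assumes that UDG plus alkali removes the deaminated nucleotide itself, so that the labelled fragment consists of the bases 3′ of that position.

```python
def labelled_fragment_lengths(oligo, deaminated_positions=None):
    """Map each deaminated dC position to the length of the 3'-labelled fragment.

    Positions are 0-based from the 5' end. If no positions are given, every C
    in the oligonucleotide is treated as a candidate deamination site.
    Assumption: cleavage removes the deaminated nucleotide, so the labelled
    fragment holds only the bases downstream (3') of it.
    """
    seq = oligo.upper()
    if deaminated_positions is None:
        deaminated_positions = [i for i, base in enumerate(seq) if base == "C"]
    lengths = {}
    for pos in deaminated_positions:
        if seq[pos] != "C":
            raise ValueError(f"position {pos} is {seq[pos]!r}, not C")
        lengths[pos] = len(seq) - pos - 1
    return lengths


if __name__ == "__main__":
    # Hypothetical 14-mer containing the TCCGCG motif; not a sequence from Table 4.
    example = "ATTATCCGCGATTA"
    for position, length in labelled_fragment_lengths(example).items():
        print(f"dC at position {position}: labelled fragment of {length} nt")
```

Because each deaminated position gives a labelled fragment of a distinct length, the band pattern on the PAGE-urea gel indicates which dC residues were deaminated.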

The partially-purified wild type protein (but not the E63->A mutant) showed clear activity on a single-stranded oligonucleotide with the cleavage being dependent on the subsequent incubation with UDG (FIG. 14B, C). The deaminating activity was not inhibited by tetrahydrouridine (which inhibits cytidine deaminases (Frederico et al. (1990) Biochemistry, 29, 2532-2537)) or by RNAse (FIG. 14D). Strikingly [and consistent with our inability to detect deamination on double-stranded oligonucleotide substrates using crude extracts of bacterial transformants (see above)], the activity was blocked if a complementary (but not if an irrelevant) oligonucleotide was titrated into the assay (FIG. 14E). Examination of the cleavage products generated in the UDG-based assay suggests that not all dC residues are equally susceptible to APOBEC1-mediated deamination. It is clear, for example, that in oligonucleotide SPM168 the third cytosine in the sequence TCCGCG is much less favoured than the other two (FIG. 14B-E). Similarly, evidence of specificity comes from comparing various related oligonucleotides as substrates, where all the data taken together point to deamination being especially disfavoured when a purine is located immediately 5′ of the cytosine (FIG. 14F).
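The sequence-context observation above can be illustrated by classifying each cytosine according to its 5′-flanking base. The following Python sketch is illustrative only; the function name and labels are hypothetical, and it encodes just the qualitative rule stated in the text (a 5′ purine is disfavoured), not any measured rate.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_cytosines(seq):
    """Return (index, five_prime_base, label) for each C in a 5'->3' sequence.

    Indices are 0-based. The label reflects only the qualitative bias
    described in the text: deamination is disfavoured after a 5' purine.
    """
    seq = seq.upper()
    results = []
    for i, base in enumerate(seq):
        if base != "C":
            continue
        if i == 0:
            results.append((i, None, "no 5' neighbour"))
        elif seq[i - 1] in PURINES:
            results.append((i, seq[i - 1], "5' purine (disfavoured)"))
        elif seq[i - 1] in PYRIMIDINES:
            results.append((i, seq[i - 1], "5' pyrimidine"))
        else:
            results.append((i, seq[i - 1], "unknown 5' base"))
    return results


if __name__ == "__main__":
    for index, flank, label in classify_cytosines("TCCGCG"):
        print(index, flank, label)
```

Applied to TCCGCG, the first two cytosines follow T and C (pyrimidines) while the third follows G (a purine), which matches the observation that the third cytosine is much less favoured than the other two.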

Discussion

The results described here provide biochemical evidence that APOBEC1-mediated deamination of cytosine to uracil can occur on single-stranded DNA, is dependent on local sequence context and is abolished by mutation of the APOBEC1 zinc-coordination motif. Unlike AID (where genetic evidence indicates that the natural physiological substrate of deamination is DNA (Harris et al. (2002) Mol. Cell. 10, 1247-1253, Wagner et al. (1989) Proc. Nat. Acad. Sci. USA, 86, 2647-2651)), the major physiological substrate of APOBEC1 is clearly apolipoprotein B RNA (Teng et al. (1993) Science 260, 1816-1819, Blanc, V. and Davidson, N. O. (2003) J. Biol. Chem. 278, 1395-1398). Nevertheless, the observation that misexpression of APOBEC1 in transgenic mice predisposes to cancer suggests that APOBEC1-mediated DNA deamination could well be of pathological relevance.

Given the abundance of APOBEC1 polypeptide in the peak fraction from the gel filtration column, it appears that, on average, each molecule of recombinant APOBEC1 is responsible for on the order of a single deamination event in a 10 minute incubation in the UDG-based assay. Crude calculations indicate that if the ~500 molecules of APOBEC1 expressed in each E. coli transformant displayed a DNA deamination activity of this order in vivo and if this were targeted randomly to all cytosine residues in the genome, then this could, in principle, be more than sufficient to account for the several thousand-fold enhanced mutation frequencies seen at the rpoB and other loci in UDG-deficient E. coli following 20 generations of growth (Randerath K. and Randerath E. (1967) Method Enzymol 12, 323-347). Similarly, somatic hypermutation of immunoglobulin variable genes by targeted AID-mediated dC deamination may involve a single, and most probably fewer than ten, targeted dC deamination events in each B lymphocyte cell cycle.
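The kind of crude calculation referred to above can be sketched as follows. This Python sketch is illustrative only and is not the calculation actually performed. The figures taken from the text are ~500 molecules per cell, about one deamination per molecule per 10 minutes, and 20 generations. The generation time (30 min), genome size (about 4.6 Mb) and GC content (about 50%) are assumptions introduced here for illustration, as are all names in the code.

```python
# Figures stated in the text.
MOLECULES_PER_CELL = 500        # ~500 APOBEC1 molecules per transformant
MINUTES_PER_DEAMINATION = 10.0  # about one event per molecule per 10 min
GENERATIONS = 20

# Assumptions introduced for this sketch (not from the source).
GENERATION_TIME_MIN = 30.0      # assumed doubling time
GENOME_BP = 4.6e6               # approximate E. coli genome size
GC_FRACTION = 0.5               # approximate GC content


def deaminations_per_generation(molecules, generation_min, minutes_per_event):
    """Expected deamination events per cell in one generation."""
    return molecules * generation_min / minutes_per_event


def expected_hits_per_cytosine(events_per_generation, generations, cytosines):
    """Expected deaminations at any one cytosine if targeting is random."""
    return events_per_generation * generations / cytosines


# Each G:C base pair contributes one cytosine when both strands are counted.
cytosines = GENOME_BP * GC_FRACTION

events = deaminations_per_generation(
    MOLECULES_PER_CELL, GENERATION_TIME_MIN, MINUTES_PER_DEAMINATION
)
hits = expected_hits_per_cytosine(events, GENERATIONS, cytosines)

if __name__ == "__main__":
    print(f"deaminations per cell per generation: {events:.0f}")
    print(f"expected deaminations per cytosine over {GENERATIONS} generations: {hits:.4f}")
```

Under these assumptions the estimate is of the order of one deamination per hundred cytosines over 20 generations. The sketch ignores repair, replication timing and the single-stranded DNA requirement, so it should be read only as an order-of-magnitude illustration.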

The results provide information about the preferred target of APOBEC1-mediated DNA deamination. The in vitro assay reveals a clear sensitivity to the local sequence context of the dC residue to be deaminated. The results obtained here suggest there may be bias against a 5′-flanking purine residue. This would accord well with the in vivo data where a near-total restriction to mutation at dC residues with a 5′-flanking pyrimidine is seen at the rpoB locus (Randerath and Randerath).

The in vitro assay also reveals that APOBEC1 deamination is targeted to single-stranded DNA and, indeed, was undetectable on double-stranded DNA. This specificity for single-stranded DNA is in accordance with the fact that the natural substrate of APOBEC1 is most likely single-stranded RNA (Blanc and Davidson) and, presumably, the same active site in APOBEC1 is used for both types of polynucleotide. Furthermore, spontaneous deamination of cytosine is also much more rapid in single- (as opposed to double-) stranded DNA, which may explain the correlation with transcription of the DNA target gene described herein, where convergent promoters increase the availability of single-stranded DNA to APOBEC-1.

Example 4 Expression of Apobec-1 Fusion Proteins

The Apobec-1 expression plasmid was generated as described above but using a nucleic acid encoding rat Apobec1 with an amino-terminal fusion encoding:

(SEQ ID NO: 15) Met-His-His-His-His-His-His-His-His-Tyr-Asp-Ile-Pro-Thr-Ala-Ser-Glu-Asn-Leu-Tyr-Phe-Gln-Gly-Ser- joining to the initiator Met of Apobec-1.

The expression construct was expressed in E. coli strain BL21 DE3 (purchased from Novagen) and the effect on the frequency of mutation to rifampicin-resistance (Rif^(R)) was measured by fluctuation analysis as described above.

The results are as follows:

                Rif^(R) colonies
vector alone      42    35    28    23
His-Apobec-1    3000  3000  2000  1500

(The numbers are numbers of Rif^(R) colonies in 4 independent experiments, the experiments being performed as in Tables 1 and 2).
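The size of the effect in the table above can be summarised as a fold increase over the vector-only control. The Python sketch below is illustrative only; the choice of the median (and, for comparison, the mean) as the summary statistic is an assumption made here, and the variable and function names are hypothetical. The colony counts are those given in the table.

```python
from statistics import mean, median

# Rif-resistant colony counts from the four independent experiments.
vector_alone = [42, 35, 28, 23]
his_apobec1 = [3000, 3000, 2000, 1500]


def fold_increase(treated, control, summary=median):
    """Ratio of a summary statistic of `treated` to that of `control`."""
    control_value = summary(control)
    if control_value == 0:
        raise ValueError("control summary is zero; fold increase undefined")
    return summary(treated) / control_value


if __name__ == "__main__":
    print(f"median fold increase: {fold_increase(his_apobec1, vector_alone):.1f}")
    print(f"mean fold increase:   {fold_increase(his_apobec1, vector_alone, mean):.1f}")
```

With either summary statistic the His-tagged construct gives roughly a 70- to 80-fold increase in Rif^(R) colonies over vector alone.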

This demonstrates that the Apobec fusion protein with a His-tag fused to its N-terminus retains mutator activity in E. coli.

Example 5 Hybridisation Experiments

A cancer profiling array was obtained from Clontech (Cat. No. 7757-1) and hybridised as directed with the following ³²P-dCTP-labeled human cDNA probes: APOBEC1 (IMAGE clone 2107422), APOBEC3G (IMAGE clone 1284557) and ubiquitin (control provided with array). The array was hybridised first with APOBEC1, subsequently with APOBEC3G, and finally with ubiquitin. After each hybridisation the probe was removed by boiling in 0.5% SDS. Hybridisation images, shown in FIG. 15, were visualised with the Typhoon Phosphoimaging System (Pharmacia) and ImageQuant software. Data are grouped by tissue to facilitate comparison, although the entire blot (representing all tissues shown) was hybridised simultaneously as a single filter in each experiment (i.e. with each probe) and the autoradiographic image subsequently separated by computer manipulation (without adjusting gain or background).

Results

APOBEC1 expression appears to be restricted to gastrointestinal tissues (colon, stomach, rectum, and small intestine), whereas APOBEC3G was expressed to some extent in all tissues examined. Perhaps most notable is the fact that for some tumour samples, APOBEC1 (colon and rectum) and APOBEC3G (breast and kidney) appear better expressed than in corresponding normal tissues (only intra-hybridisation pairs should be considered). Note also that for APOBEC1 hybridisation of stomach samples the opposite may be the case.

REFERENCES

1. Muramatsu, M., Sankaranand, V. S., Anant, S., Sugai, M., Kinoshita, K., Davidson, N. O. & Honjo, T. Specific expression of activation-induced cytidine deaminase (AID), a novel member of the RNA-editing deaminase family in germinal center B cells. J. Biol. Chem. 274, 18470-18476 (1999).
2. Muramatsu, M., Kinoshita, K., Fagarasan, S., Yamada, S., Shinkai, Y. & Honjo T. Class switch recombination and hypermutation require activation-induced cytidine deaminase (AID), a potential RNA editing enzyme. Cell 102, 553-563 (2000).
3. Revy, P., Muto, T., Levy, Y., Geissmann, F., Plebani, A., Sanal, O. et al. Activation-induced cytidine deaminase (AID) deficiency causes the autosomal recessive form of the Hyper-IgM syndrome (HIGM2). Cell 102, 565-575 (2000).
4. Arakawa, H., Hauschild, J. & Buerstedde, J. M. Requirement of the Activation-Induced Deaminase (AID) gene for immunoglobulin gene conversion. Science 295, 1301-1306 (2002).
5. Harris, R. S., Sale, J. E., Petersen-Mahrt, S. K. & Neuberger, M. S. AID is essential for immunoglobulin V gene conversion in a cultured B cell line. Curr. Biol. 12, 435-438 (2002).
6. Martin, A., Bardwell, P. D., Woo, C. V. J., Fan, M., Shulman, M. J. & Scharff, M. D. Activation-induced cytidine deaminase turns on somatic hypermutation in hybridomas. Nature 415, 802-806 (2002).
7. Okazaki, I., Kinoshita, K., Muramatsu, M., Yoshikawa, K., & Honjo T. The AID enzyme induces class switch recombination in fibroblasts. Nature 416, 340-345 (2002).
8. Maizels, N. Somatic hypermutation: how many mechanisms diversify V region sequences? Cell 83, 9-12 (1995).
9. Weill, J. C. & Reynaud, C. A. Rearrangement/hypermutation/gene conversion: when, where and why? Immunol Today 17, 92-97 (1996).
10. Sale, J. E., Calandrini, D. M., Takata, M., Takeda, S. & Neuberger, M. S. Ablation of XRCC2/3 transforms immunoglobulin V gene conversion into somatic hypermutation. Nature 412, 921-926 (2001).
11. Ehrenstein, M. R. & Neuberger, M. S. Deficiency in Msh2 affects the efficiency and local sequence specificity of immunoglobulin class-switch recombination: parallels with somatic hypermutation. EMBO J. 18, 3484-3490 (1999).
12. Rada, C., Ehrenstein, M. R., Neuberger, M. S. & Milstein, C. Hot spot focusing of somatic hypermutation in MSH2-deficient mice suggests two stages of mutational targeting. Immunity 9, 135-141 (1998).
13. Wiesendanger, M., Kneitz, B., Edelmann, W., & Scharff, M. D. Somatic hypermutation in MutS homologue (MSH)3-, MSH6-, and MSH3/MSH6-deficient mice reveals a role for the MSH2-MSH6 heterodimer in modulating the base substitution pattern. J. Exp. Med. 191, 579-584 (2000).
14. Lindahl T. Suppression of spontaneous mutagenesis in human cells by DNA base excision-repair. Mutat. Res. 462, 129-135 (2000).
15. Sale, J. E. & Neuberger, M. S. TdT-accessible breaks are scattered over the immunoglobulin V domain in a constitutively hypermutating B cell line. Immunity 9, 859-869 (1998).
16. Manis, J. P., Tian, M. & Alt, F. W. Mechanism and control of class-switch recombination. Trends Immunol. 23, 31-39 (2002).
17. Petersen S., Casellas R., Reina-San-Martin, B., Chen, H. T., Difilippantonio, M. J., et al. AID is required to initiate Nbs1/gamma-H2AX focus formation and mutations at sites of class switching. Nature 414, 660-665 (2001).
18. Chen, X., Kinoshita, K. & Honjo, T. Variable deletion and duplication at recombination junction ends: implication for staggered double-strand cleavage in class-switch recombination. Proc. Natl. Acad. Sci. USA. 98, 13860-13865 (2001).
19. Sung, J., Bennett, S. E. & Mosbaugh, D. W. Fidelity of uracil-initiated base excision DNA repair in Escherichia coli cell extracts. J. Biol. Chem. 276, 2276-2285 (2001).
20. Mokkapati, S. K., Fernandez de Henestrosa, A. R., Bhagwat, A. S. Escherichia coli DNA glycosylase Mug: a growth-regulated enzyme required for mutation avoidance in stationary-phase cells. Mol. Microbiol. 41, 1101-1111 (2001).
21. Schouten, K. A. & Weiss, B. Endonuclease V protects Escherichia coli against specific mutations caused by nitrous acid. Mut. Res. 435, 245-254 (1999).
22. He, B., Qing, H. & Kow, Y. W. Deoxyxanthosine in DNA is repaired by Escherichia coli endonuclease V. Mut. Res. 459, 109-114 (2000).
23. Betz, A. G., Milstein, C., Gonzalez-Fernandez, A., Pannell, R., Larson, T., Neuberger, M. S. Elements regulating somatic hypermutation of an immunoglobulin kappa gene: critical role for the intron enhancer/matrix attachment region. Cell 77, 239-248 (1994).
24. Jarmuz A., Chester A., Bayliss J., Gisbourne J., Dunham I., et al. An Anthropoid-Specific Locus of Orphan C to U RNA-Editing Enzymes on Chromosome 22. Genomics 79, 285-296 (2002).
25. Yamanaka, S., Balestra, M. E., Ferrell, L. D., Fan, J., Arnold, et al. Apolipoprotein B mRNA-editing protein induces hepatocellular carcinoma and dysplasia in transgenic animals. Proc. Natl. Acad. Sci. USA. 92, 8483-8487 (1995).
26. Shen, J.-C., Rideout, W. M. & Jones, P. A. High frequency mutagenesis by a DNA methyltransferase. Cell 71, 1073-1080 (1992).
27. Navaratnam, N., Fujino, T., Bayliss, J., Jarmuz, A., How, et al. Escherichia coli cytidine deaminase provides a molecular model for ApoB RNA editing and a mechanism for RNA substrate recognition. J. Mol. Biol. 275, 695-714 (1998).
28. Selker, E. U. Premeiotic instability of repeated sequences in Neurospora crassa. Ann. Rev. Genet. 24, 579-613 (1990).
29. Jin, D. J. & Zhou, Y. N. Mutational Analysis of structure-function relationship of RNA polymerase in Escherichia coli. Methods Enzymol. 273, 300-319 (1996).
30. Jarmuz et al. Genomics 79, 285 (2002).

All publications mentioned in the above specification, and references cited in said publications, are herein incorporated by reference. Various modifications and variations of the described methods and system of the present invention will be apparent to those skilled in the art without departing from the scope and spirit of the present invention. Although the invention has been described in connection with specific preferred embodiments, it should be understood that the invention as claimed should not be unduly limited to such specific embodiments. Indeed, various modifications of the described modes for carrying out the invention which are obvious to those skilled in molecular biology or related fields are intended to be within the scope of the following claims.

1. A cell comprising a nucleic acid encoding an Activation Induced Deaminase (AID) polypeptide, or an AID variant, derivative or homologue, and having a mutator phenotype.
2. The cell of claim 1, wherein said cell is a prokaryotic cell.
3. The cell of claim 1, wherein said cell is a eukaryotic cell.
4. The cell of claim 1, wherein the AID homologue is Apobec-1, Apobec3C or Apobec3G.
5. A fusion protein comprising an AID polypeptide, or AID variant, derivative or homologue thereof, having a mutator phenotype operably linked to one half of a specific binding pair.
6. The fusion protein of claim 5, wherein said one half of a specific binding pair is a DNA binding domain.
7. A vector comprising a nucleic acid encoding the fusion protein of claim 5.
8. A cell comprising a nucleic acid encoding the fusion protein of claim 5.
9. A method for preparing a gene product having a desired activity, comprising the steps of: a) expressing a nucleic acid encoding the gene product in a population of cells according to claim 1; b) identifying a cell or cells within the population of cells which expresses a mutant gene product having the desired activity; and c) establishing one or more clonal populations of cells from the cell or cells identified in step (b), and selecting from said clonal populations a cell or cells which expresses a gene product having an improved desired activity.
10. A method as claimed in claim 9, wherein the nucleic acid encoding the gene product is operably linked to the second half of a specific binding pair.
11. A method of directing mutation to a specific gene product of interest comprising: i) generating a nucleic acid construct comprising a nucleic acid sequence encoding a specific gene product operably linked to a DNA binding protein recognition sequence; ii) transfecting said nucleic acid construct into a population of host cells expressing the fusion protein of claim 6; iii) incubating said transfected host cells under conditions suitable for allowing the specific binding pairing of DNA binding protein to DNA binding protein recognition sequence to occur; iv) identifying a cell or cells within the population of cells which expresses a mutant gene product having the desired activity; and v) establishing one or more clonal populations of cells from the cell or cells identified in (iv), and selecting from said clonal populations a cell or cells which expresses a gene product having an improved desired activity.
12. A method of identifying components of AID-dependent mutation activity comprising expressing AID in a cell deficient in expression or activity of a known gene and assessing mutator activity compared to activity in a cell expressing said gene.
13. A method of screening for a modulator of AID activity comprising: (a) expressing AID in a prokaryotic cell; (b) maintaining the AID-expressing prokaryotic cell in the presence of a selectable medium; (c) detecting the presence of colonies in the absence or presence of a test compound, wherein a modified number of colonies when compared to a sample in the absence of a test compound is indicative of the ability of the test compound to modify AID mutator activity.
14. A method of inducing a mutation in a cell comprising administering an AID polypeptide or functional homologue thereof.
15. A method for treating a disorder characterized by an increased mutation rate, comprising administering an agent that modifies AID functional activity or gene expression.
16. A method of decreasing hypermutation/resistance to a compound such as an antibiotic in a population of bacteria comprising modulating activity of a bacterial AID homologue.
17. A construct for use in the method of claim 11, said construct comprising a coding sequence for the gene product of interest, wherein said coding sequence is placed under the control of a first promoter upstream of the coding sequence and further comprising a second promoter downstream of the coding sequence, wherein said first and second promoters are arranged in opposing orientation so as to allow convergent transcription of the coding sequence.